
Orca Mini 3B

Orca Mini 3B is a large language model developed by the community under the Psmathur-Orca initiative. With 3b parameters it is small enough to run on modest hardware while remaining useful for a wide range of applications. The model is released under the Creative Commons Attribution Non Commercial Share Alike 4.0 International (CC-BY-NC-SA-4.0) license, which permits use and modification with attribution but prohibits commercial exploitation. Its design follows the explain-tuning methodology of the Orca research paper, with an emphasis on instruction following.
Description of Orca Mini 3B
Orca Mini 3B is a 3b parameter large language model developed by the Psmathur-Orca community. It is trained on explain-tuned datasets built with methodologies from the Orca research paper, incorporating data from WizardLM, Alpaca, and Dolly-V2. Training ran for 3 epochs on 8x A100 (80 GB) GPUs with a maximum sequence length of 1024. The model targets instruction following and learning the step-by-step thought processes demonstrated by ChatGPT, and is released under the CC-BY-NC-SA-4.0 license noted above. Its architecture emphasizes versatility for diverse applications.
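Because the explain-tuned data pairs each instruction with a system prompt, inference generally works best when the input follows the same layout. Below is a minimal Python sketch of a prompt builder; the `### System:` / `### User:` / `### Response:` layout and the example system message are assumptions based on common Orca Mini usage, not something this page specifies.

```python
def build_prompt(system: str, instruction: str, user_input: str = "") -> str:
    """Assemble a prompt in the assumed ### System / ### User / ### Response layout."""
    prompt = f"### System:\n{system}\n\n### User:\n{instruction}\n\n"
    if user_input:
        prompt += f"### Input:\n{user_input}\n\n"  # optional extra context
    prompt += "### Response:\n"  # the model completes the text from here
    return prompt


# Hypothetical system message; adjust to your use case.
system = ("You are an AI assistant that follows instructions well "
          "and explains its reasoning step by step.")
print(build_prompt(system, "Why is the sky blue?"))
```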
Parameters & Context Length of Orca Mini 3B
Orca Mini 3B is a 3b parameter model with a 1k context length, placing it in the small-parameter, short-context class of models. The 3b size keeps resource use low and inference fast, making it well suited to lightweight tasks, while the 1k context length supports concise interactions but limits how much text it can process at once (a length-check sketch follows the list below). These choices trade the ability to handle long or highly complex inputs for accessibility, simplicity, and speed.
- Name: Orca Mini 3B
- Parameter_Size: 3b
- Context_Length: 1k
- Implications: Efficient for simple tasks, limited context for long texts.
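Because the 1k context window covers both the prompt and the generated tokens, it is worth checking prompt length before calling the model. The sketch below uses the Hugging Face tokenizer; the repo id `psmathur/orca_mini_3b` is an assumption and should be replaced with whichever checkpoint you actually use.

```python
from transformers import AutoTokenizer

# Assumed Hugging Face repo id; substitute your actual checkpoint.
tokenizer = AutoTokenizer.from_pretrained("psmathur/orca_mini_3b")

MAX_CONTEXT = 1024  # the model's 1k-token window


def fits_in_context(prompt: str, reserve_for_output: int = 256) -> bool:
    """True if the prompt leaves `reserve_for_output` tokens for generation."""
    n_prompt_tokens = len(tokenizer.encode(prompt))
    return n_prompt_tokens + reserve_for_output <= MAX_CONTEXT


print(fits_in_context("Why is the sky blue?"))  # True for short prompts
```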
Possible Intended Uses of Orca Mini 3B
Orca Mini 3B is a 3b parameter model designed for instruction following and thought-process learning. Its possible uses include answering complex questions with detailed reasoning, generating structured text based on system prompts, and assisting with code generation and execution. These possible applications could support tasks requiring logical analysis, content creation, or programming guidance, though their effectiveness would depend on the specific implementation and validation; a usage sketch follows the list below. The model's architecture and training data suggest it could be adapted to scenarios that call for structured output and step-by-step problem-solving, but further research is needed to confirm its suitability for such roles.
- Name: Orca Mini 3B
- Intended_Uses: answering complex questions with detailed reasoning, generating structured text based on system prompts, code generation and execution assistance
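To illustrate the first intended use, the sketch below runs a reasoning question through the model with the `transformers` library. The repo id, dtype, and generation settings are assumptions; treat this as a starting point under those assumptions, not a reference implementation.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id; at 3b parameters the model fits a single consumer GPU in fp16.
model_id = "psmathur/orca_mini_3b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = (
    "### System:\nYou are an AI assistant that reasons step by step.\n\n"
    "### User:\nA train travels 60 km in 45 minutes. What is its speed in km/h?\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```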
Possible Applications of Orca Mini 3B
Beyond the intended uses above, possible applications of Orca Mini 3B extend to supporting educational tools, alongside detailed-reasoning question answering, structured text generation from system prompts, and code generation and execution assistance. Each possible application must be thoroughly evaluated and tested before deployment to ensure alignment with specific requirements; a structured-output sketch follows the list below.
- answering complex questions with detailed reasoning
- generating structured text based on system prompts
- code generation and execution assistance
- supporting educational tools
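As a concrete illustration of structured text generation, the sketch below pairs a system prompt that requests JSON output with a defensive parser. The system message and schema are hypothetical, and a 3b model cannot be assumed to always emit valid JSON, so validating the reply is prudent.

```python
import json

# Hypothetical system message asking for machine-readable output.
system = (
    "You are an assistant that answers only with valid JSON using the keys "
    "'answer' and 'reasoning'."
)
prompt = (
    f"### System:\n{system}\n\n"
    "### User:\nWhat is 12 * 9? Explain briefly.\n\n"
    "### Response:\n"
)


def parse_structured(raw: str) -> dict:
    """Validate the model's reply; small models often need this safety net."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Fall back to wrapping the raw text so callers always get a dict.
        return {"answer": None, "reasoning": raw.strip()}


# Send `prompt` to whichever backend you use, then validate the reply, e.g.:
print(parse_structured('{"answer": 108, "reasoning": "12 * 9 = 108."}'))
```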
Quantized Versions & Hardware Requirements of Orca Mini 3B
The medium q4 version of Orca Mini 3B requires a GPU with at least 12 GB of VRAM and 32 GB of system memory, making it suitable for mid-range graphics cards. Available quantized versions include fp16, q2, q3, q4, q5, q6, and q8.
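Once a quantized file is downloaded, it can be run locally with the `llama-cpp-python` bindings. The sketch below is minimal; the GGUF file name is a placeholder (this page does not list download paths), and the generation settings are assumptions.

```python
from llama_cpp import Llama

# Placeholder GGUF path; point this at your downloaded q4 quantized file.
llm = Llama(model_path="orca-mini-3b.q4_0.gguf", n_ctx=1024)  # match the 1k window

out = llm(
    "### System:\nYou are a helpful assistant.\n\n"
    "### User:\nSummarize what explain-tuning is in two sentences.\n\n"
    "### Response:\n",
    max_tokens=200,
    stop=["### User:"],  # stop before the model invents a new turn
)
print(out["choices"][0]["text"])
```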
Conclusion
Orca Mini 3B is a 3b parameter large language model developed by the Psmathur-Orca community and released under the Creative Commons Attribution Non Commercial Share Alike 4.0 International (CC-BY-NC-SA-4.0) license. Trained on explain-tuned datasets derived from sources such as WizardLM and Alpaca, it offers a 1k context length and multiple quantized versions, with q4 providing a balance of quality and efficiency. It is designed for instruction following across diverse applications, with hardware requirements that fit mid-range GPUs and system memory.
Benchmarks
| Benchmark Name | Score |
|---|---|
| Instruction Following Evaluation (IFEval) | 7.42 |
| Big Bench Hard (BBH) | 4.69 |
| Mathematical Reasoning Test (MATH Lvl 5) | 0.83 |
| Graduate-Level Google-Proof Q&A (GPQA) | 0.00 |
| Multistep Soft Reasoning (MuSR) | 4.20 |
| Massive Multitask Language Understanding (MMLU-PRO) | 1.61 |
