Smollm

Smollm 360M Base - Details

Last update on 2025-05-18

Smollm 360M Base is a small language model developed by Hugging Face Smol Models Research Enterprise with 360M parameters. It is released under the Apache License 2.0 (Apache-2.0) and is designed around careful data curation and architecture choices to improve performance in small to medium-sized models.

Description of Smollm 360M Base

SmolLM is a series of state-of-the-art small language models available in three sizes: 135M, 360M, and 1.7B parameters. These models are trained on Cosmo-Corpus, a high-quality dataset comprising Cosmopedia v2 (28B tokens of synthetic textbooks and stories generated by Mixtral), Python-Edu (4B tokens of educational Python samples from The Stack), and FineWeb-Edu (220B tokens of deduplicated educational web samples from FineWeb). SmolLM models have demonstrated strong performance in benchmarks testing common sense reasoning and world knowledge, outperforming other models in their size categories.
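The Cosmo-Corpus components described above sum to roughly 252B tokens. A minimal sketch of the mixture, using only the token counts quoted in the description (shares are approximate):

```python
# Approximate Cosmo-Corpus composition, using the token counts
# quoted in the description above (in billions of tokens).
corpus = {
    "Cosmopedia v2 (synthetic textbooks/stories)": 28,
    "Python-Edu (educational Python from The Stack)": 4,
    "FineWeb-Edu (educational web samples)": 220,
}

total = sum(corpus.values())  # 252B tokens in total
for name, tokens in corpus.items():
    share = 100 * tokens / total
    print(f"{name}: {tokens}B tokens ({share:.1f}%)")
```

Note that FineWeb-Edu dominates the mixture at roughly 87% of the tokens, with the synthetic Cosmopedia v2 data contributing about 11%.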

Parameters & Context Length of Smollm 360M Base


Smollm 360M Base has 360M parameters, placing it in the small model category and giving it fast, resource-efficient performance on tasks of moderate complexity. Its 4k context length is short, which suits concise tasks but limits its ability to process extended texts. This combination balances efficiency and capability, and is ideal for applications where speed and simplicity matter more than handling very long sequences.

  • Name: Smollm 360M Base
  • Parameter_Size: 360m
  • Context_Length: 4k
  • Implications: Small model efficiency, short context limitations, optimized for speed and simplicity.
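The 4k context limit can be checked ahead of time. A rough sketch, assuming the common heuristic of about 4 characters per token for English text (an exact count would require the model's own tokenizer):

```python
CONTEXT_LENGTH = 4096   # Smollm 360M Base's 4k context window
CHARS_PER_TOKEN = 4     # rough heuristic for English text

def estimate_tokens(text: str) -> int:
    """Very rough token estimate; use the real tokenizer for exact counts."""
    return max(1, len(text) // CHARS_PER_TOKEN)

def fits_in_context(prompt: str, reserved_for_output: int = 256) -> bool:
    """Check whether a prompt plus an output budget fits in the 4k window."""
    return estimate_tokens(prompt) + reserved_for_output <= CONTEXT_LENGTH

print(fits_in_context("Summarize this paragraph." * 10))  # True: short prompt fits
print(fits_in_context("x" * 50_000))                      # False: ~12.5k tokens is too long
```

Reserving part of the window for the model's output matters here: with only 4k tokens total, a long prompt can leave no room for generation.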

Possible Intended Uses of Smollm 360M Base


The Smollm 360M Base model has possible applications in code generation, where it could assist with writing or optimizing code snippets; text completion, where it could help draft or extend short passages; and question answering, where it could provide concise responses to specific queries. In each case, effectiveness would depend on the complexity of the task and the quality of the training data, so these uses need careful evaluation before deployment.

  • Name: Smollm 360M Base
  • Intended_Uses: code generation, text completion, question answering
  • Purpose: potential applications in coding, text drafting, and answering specific questions
  • Important: these uses require thorough investigation and testing.
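For text completion, a minimal sketch with the transformers library is shown below. The model id `HuggingFaceTB/SmolLM-360M` is an assumption based on the SmolLM series naming; verify it on the Huggingface model page before use.

```python
# Sketch of text completion with the transformers library.
# The model id below is an assumption -- verify it on the model page.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM-360M"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Greedy decoding (`do_sample=False`) is used here for reproducibility; as a base (non-instruct) model, it is suited to continuing text rather than following instructions.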

Possible Applications of Smollm 360M Base


The Smollm 360M Base model could have possible applications as a code assistant, helping to write or refine code snippets; as a text completion tool, drafting or extending short passages; as a question answering system, giving concise responses to specific queries; and as a content summarizer, condensing longer texts into shorter forms. Its effectiveness in each role would depend on the complexity of the inputs, so every application would need thorough validation in its target scenario before implementation.

  • Name: Smollm 360M Base
  • Applications: code generation, text completion, question answering, content summarization
  • Important: Each application must be thoroughly evaluated and tested before use.

Quantized Versions & Hardware Requirements of Smollm 360M Base


The q4 quantized version of Smollm 360M Base requires at least 8GB of VRAM for efficient operation, making it suitable for systems with mid-range GPUs. This quantization balances precision and performance; deployment on devices with 4GB–8GB of VRAM is possible for lighter tasks, though 8GB is recommended for smoother execution. A multi-core CPU, at least 32GB of system RAM, and adequate cooling and power supply are also advised.

  • Name: Smollm 360M Base
  • Quantized_Versions: fp16, q2, q3, q4, q5, q6, q8
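The listed quantization levels trade file size for precision. A rough sketch of expected on-disk sizes for a 360M-parameter model, assuming typical bits-per-weight figures for each level (these are approximations; real quantized files mix precisions and carry metadata, so actual sizes differ somewhat):

```python
PARAMS = 360_000_000  # Smollm 360M Base parameter count

# Approximate bits per weight for common quantization levels
# (assumed typical values, not exact format specifications).
bits_per_weight = {
    "fp16": 16.0,
    "q8": 8.5,
    "q6": 6.6,
    "q5": 5.5,
    "q4": 4.5,
    "q3": 3.4,
    "q2": 2.6,
}

for level, bits in bits_per_weight.items():
    size_mb = PARAMS * bits / 8 / 1_000_000
    print(f"{level}: ~{size_mb:.0f} MB")
```

For example, fp16 works out to roughly 720 MB and q4 to roughly 200 MB, which is why even modest hardware can hold several quantized variants of a model this size.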

Conclusion

Smollm 360M Base is a small language model developed by Hugging Face Smol Models Research Enterprise with 360M parameters, released under the Apache License 2.0 and optimized for small to medium-sized tasks. Its 4k context length balances efficiency and performance for applications of moderate complexity.

References

Huggingface Model Page
Ollama Model Page

Benchmarks

  • Instruction Following Evaluation (IFEval): 21.34
  • Big Bench Hard (BBH): 3.28
  • Mathematical Reasoning Test (MATH Lvl 5): 1.13
  • General Purpose Question Answering (GPQA): 2.35
  • Multimodal Understanding and Reasoning (MUSR): 8.09
  • Massive Multitask Language Understanding (MMLU-PRO): 1.37
Link: Huggingface - Open LLM Leaderboard
Parameters & Context Length
  • Parameters: 360m
  • Context Length: 4K
Statistics
  • Huggingface Likes: 62
  • Huggingface Downloads: 32K
Intended Uses
  • Code Generation
  • Text Completion
  • Question Answering
Languages
  • English