Bge-Large

Bge Large 335M - Details

Last update on 2025-05-18

Bge Large 335M is a text embedding model developed by the Beijing Academy of Artificial Intelligence (BAAI), a nonprofit organization. It has 335 million parameters and is released under the MIT License.

Description of Bge Large 335M

FlagEmbedding is a system that maps any text to a low-dimensional dense vector, enabling tasks such as retrieval, classification, clustering, and semantic search. It includes models such as BAAI/bge-large-en-v1.5, optimized for embedding tasks, and is integrated into frameworks like LangChain, which broadens its use in real-world applications.
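As a hedged illustration of that workflow, the sketch below embeds two invented sentences with the FlagEmbedding package; the FlagModel API and the retrieval instruction follow the FlagEmbedding project's documented usage, and the 1024-dimensional output is the published size for bge-large.

```python
# Minimal sketch: embed sentences with FlagEmbedding (pip install FlagEmbedding).
from FlagEmbedding import FlagModel

model = FlagModel(
    "BAAI/bge-large-en-v1.5",
    query_instruction_for_retrieval="Represent this sentence for searching relevant passages: ",
    use_fp16=True,  # halves memory use at a small cost in precision
)

sentences = ["BGE maps text to dense vectors.", "Embeddings power semantic search."]
embeddings = model.encode(sentences)
print(embeddings.shape)  # (2, 1024): bge-large produces 1024-dimensional vectors
```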

Parameters & Context Length of Bge Large 335M

The Bge Large 335M model has 335m parameters and a 32k token context length, making it well suited to tasks that require efficient processing of long texts at reasonable computational cost. The 335m parameter count places it in the small-to-mid range: resource-efficient for tasks like semantic search or classification, though without the capacity of larger models. The 32k token context length allows it to handle extended documents or conversations, balancing depth and practicality; a truncation sketch follows the list below.

  • Parameter Size: 335m
  • Context Length: 32k
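As a sketch of how the context limit applies in practice, the snippet below truncates a long input to the tokenizer's advertised maximum rather than hardcoding a length; the transformers package is assumed, and the repeated text is a stand-in for a real document.

```python
# Sketch: truncate a long document to the model's maximum sequence length.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-large-en-v1.5")
long_text = "A very long document. " * 5000  # stand-in for real input

encoded = tokenizer(long_text, truncation=True, max_length=tokenizer.model_max_length)
print(len(encoded["input_ids"]))  # capped at the model's context limit
```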

Possible Intended Uses of Bge Large 335M

The Bge Large 335M model is designed for text embedding, with possible applications in retrieval, classification, clustering, semantic search, and vector databases for large language models. Its 335m parameter size and 32k token context length suggest it can handle fairly complex text analysis, though suitability will vary with the specific requirements. The model supports English and Chinese, making it a good fit for monolingual tasks in either language where language-specific optimization matters. Possible uses such as improving search efficiency or organizing large datasets still require validation against real-world scenarios; a semantic-search sketch follows the list below.

  • retrieval
  • classification
  • clustering
  • semantic search
  • vector databases for LLMs
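To make the retrieval use concrete, here is a small, hypothetical ranking sketch: BGE embeddings are L2-normalized by default, so a dot product against the query vector yields cosine similarity. The corpus and query are invented for illustration.

```python
# Hypothetical semantic-search sketch over a tiny in-memory corpus.
import numpy as np
from FlagEmbedding import FlagModel

model = FlagModel("BAAI/bge-large-en-v1.5", use_fp16=True)

corpus = [
    "How to train a text classifier.",
    "A recipe for sourdough bread.",
    "Vector databases store and index embeddings.",
]
query = "Where are embeddings stored?"

doc_vecs = model.encode(corpus)               # one normalized vector per document
query_vec = model.encode_queries([query])[0]  # retrieval instruction is prepended

scores = doc_vecs @ query_vec                 # dot product == cosine similarity here
print(corpus[int(np.argmax(scores))])         # best match: the vector-database line
```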

Possible Applications of Bge Large 335M

The Bge Large 335M model has possible applications in retrieval, classification, clustering, and semantic search, where its 335m parameters and 32k token context length support efficient text analysis. Possible uses include organizing large datasets, improving search accuracy, and enhancing document categorization, though each scenario requires thorough testing to confirm effectiveness. Its focus on English and Chinese suits language-specific tasks, while its behavior in broader multilingual settings remains to be explored. Integration into vector databases for LLMs is another candidate, sketched after the list below, and likewise needs validation. Each application must be evaluated and tested before deployment to ensure it meets specific requirements.

  • retrieval
  • classification
  • clustering
  • semantic search
  • vector databases for LLMs
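Since the description mentions LangChain integration, the following sketch shows one plausible vector-database wiring; it assumes the langchain-community and faiss-cpu packages, which the model card itself does not prescribe.

```python
# Sketch: index bge-large embeddings in a local FAISS store via LangChain.
from langchain_community.embeddings import HuggingFaceBgeEmbeddings
from langchain_community.vectorstores import FAISS

embeddings = HuggingFaceBgeEmbeddings(
    model_name="BAAI/bge-large-en-v1.5",
    encode_kwargs={"normalize_embeddings": True},  # makes dot product == cosine
)

store = FAISS.from_texts(
    ["BGE maps text to dense vectors.", "FAISS indexes embeddings for fast lookup."],
    embedding=embeddings,
)
print(store.similarity_search("dense vector search", k=1)[0].page_content)
```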

Quantized Versions & Hardware Requirements of Bge Large 335M

The Bge Large 335M model’s fp16 (half-precision) version offers a balance between precision and performance: a GPU with at least 8GB of VRAM runs it comfortably, and it can also run on a CPU with sufficient memory. The 335m parameter size falls in the up-to-1B category, where VRAM requirements typically range from about 4GB to 8GB, so systems with moderate hardware capabilities are adequate. Hardware compatibility should still be verified for each setup; a rough memory estimate follows the list below.

  • fp16
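A back-of-the-envelope check on those figures: at fp16, each parameter takes two bytes, so the weights alone fit comfortably inside the stated 4GB to 8GB range (activations and framework overhead come on top of this).

```python
# Rough fp16 memory estimate for the weights alone.
params = 335_000_000
bytes_per_param = 2  # fp16 stores each weight in two bytes
print(f"~{params * bytes_per_param / 1024**3:.2f} GB")  # ~0.62 GB
```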

Conclusion

Bge Large 335M is a text embedding model developed by the Beijing Academy of Artificial Intelligence, a nonprofit organization, featuring 335 million parameters and released under the MIT License. It is optimized for tasks like retrieval, classification, clustering, and semantic search, supports English and Chinese, and has potential applications in text embedding pipelines and vector database integration.

References

Huggingface Model Page
Ollama Model Page

Maintainer
  • Beijing Academy of Artificial Intelligence (BAAI)
Parameters & Context Length
  • Parameters: 335m
  • Context Length: 32k
Statistics
  • Huggingface Likes: 212
  • Huggingface Downloads: 500K
Intended Uses
  • Retrieval
  • Classification
  • Clustering
  • Semantic Search
  • Vector Databases for LLMs
Languages
  • English
  • Chinese