Meta's Llama 3.2 goes small with 1B and 3B models.
2.5M Pulls 63 Tags Updated 6 weeks ago
Llama 3.1 is a new state-of-the-art model from Meta available in 8B, 70B and 405B parameter sizes.
9M Pulls 93 Tags Updated 2 months ago
Qwen2.5 models are pretrained on Alibaba's latest large-scale dataset, encompassing up to 18 trillion tokens. The models support a context length of up to 128K tokens and are multilingual.
1.9M Pulls 133 Tags Updated 7 weeks ago
A commercial-friendly small language model by NVIDIA optimized for roleplay, RAG QA, and function calling.
35.2K Pulls 17 Tags Updated 7 weeks ago
Mistral Small is a lightweight model designed for cost-effective use in tasks like translation and summarization.
41.8K Pulls 17 Tags Updated 7 weeks ago
A state-of-the-art 12B model with 128k context length, built by Mistral AI in collaboration with NVIDIA.
494.2K Pulls 17 Tags Updated 3 months ago
The 7B model released by Mistral AI, updated to version 0.3.
5.1M Pulls 84 Tags Updated 3 months ago
A set of Mixture of Experts (MoE) models with open weights by Mistral AI in 8x7b and 8x22b parameter sizes.
475.5K Pulls 69 Tags Updated 3 months ago
Command R is a large language model optimized for conversational interaction and long-context tasks.
239.7K Pulls 32 Tags Updated 2 months ago
Command R+ is a powerful, scalable large language model purpose-built to excel at real-world enterprise use cases.
103.9K Pulls 21 Tags Updated 2 months ago
Qwen2 is a new series of large language models from Alibaba Group.
3.9M Pulls 97 Tags Updated 2 months ago
The latest series of code-specific Qwen models, with significant improvements in code generation, code reasoning, and code fixing.
273.2K Pulls 67 Tags Updated 4 weeks ago
Mistral Large 2 is Mistral's new flagship model, significantly more capable in code generation, mathematics, and reasoning, with a 128k context window and support for dozens of languages.
99.2K Pulls 17 Tags Updated 3 months ago
Hermes 3 is the latest version of the flagship Hermes series of LLMs by Nous Research.
55.6K Pulls 49 Tags Updated 2 months ago
A series of models from Groq that represent a significant advancement in open-source AI capabilities for tool use and function calling.
36.4K Pulls 33 Tags Updated 3 months ago
Llama-3.1-Nemotron-70B-Instruct is a large language model customized by NVIDIA to improve the helpfulness of LLM-generated responses to user queries.
29.6K Pulls 17 Tags Updated 3 weeks ago
The IBM Granite 2B and 8B models are designed to support tool-based use cases and retrieval augmented generation (RAG), streamlining code generation, translation and bug fixing.
18.2K Pulls 33 Tags Updated 2 weeks ago
An open-weights function-calling model based on Llama 3, competitive with GPT-4o's function-calling capabilities.
14.1K Pulls 17 Tags Updated 3 months ago
The IBM Granite 1B and 3B models are the first mixture of experts (MoE) Granite models from IBM, designed for low-latency usage.
12.2K Pulls 33 Tags Updated 2 weeks ago
Cohere For AI's language models, trained to perform well across 23 different languages.
10.6K Pulls 33 Tags Updated 2 weeks ago
SmolLM2 is a family of compact language models available in three sizes: 135M, 360M, and 1.7B parameters.
10.3K Pulls 49 Tags Updated 9 days ago