The latest series of Code-Specific Qwen models, with significant improvements in code generation, code reasoning, and code fixing.
483.1K Pulls 196 Tags Updated 9 days ago
Meta's Llama 3.2 goes small with 1B and 3B models.
3.3M Pulls 63 Tags Updated 8 weeks ago
Llama 3.1 is a new state-of-the-art model from Meta available in 8B, 70B and 405B parameter sizes.
10.6M Pulls 93 Tags Updated 2 months ago
The 7B model released by Mistral AI, updated to version 0.3.
5.6M Pulls 84 Tags Updated 4 months ago
Qwen2 is a new series of large language models from Alibaba Group.
3.9M Pulls 97 Tags Updated 2 months ago
Qwen2.5 models are pretrained on Alibaba's latest large-scale dataset, encompassing up to 18 trillion tokens. The models support a context length of up to 128K tokens and are multilingual.
2M Pulls 133 Tags Updated 2 months ago
A state-of-the-art 12B model with 128k context length, built by Mistral AI in collaboration with NVIDIA.
550.7K Pulls 17 Tags Updated 3 months ago
A set of Mixture of Experts (MoE) models with open weights by Mistral AI, in 8x7b and 8x22b parameter sizes.
485.5K Pulls 69 Tags Updated 4 months ago
Command R is a Large Language Model optimized for conversational interaction and long context tasks.
243.9K Pulls 32 Tags Updated 2 months ago
Command R+ is a powerful, scalable large language model purpose-built to excel at real-world enterprise use cases.
105.3K Pulls 21 Tags Updated 2 months ago
Mistral Large 2 is Mistral's new flagship model, significantly more capable in code generation, mathematics, and reasoning, with a 128k context window and support for dozens of languages.
101.8K Pulls 32 Tags Updated yesterday
Hermes 3 is the latest version of the flagship Hermes series of LLMs by Nous Research.
58.4K Pulls 49 Tags Updated 2 months ago
Mistral Small is a lightweight model designed for cost-effective use in tasks like translation and summarization.
46.5K Pulls 17 Tags Updated 2 months ago
A commercial-friendly small language model by NVIDIA optimized for roleplay, RAG QA, and function calling.
39.6K Pulls 17 Tags Updated 2 months ago
A series of models from Groq that represent a significant advancement in open-source AI capabilities for tool use/function calling.
37.9K Pulls 33 Tags Updated 3 months ago
Llama-3.1-Nemotron-70B-Instruct is a large language model customized by NVIDIA to improve the helpfulness of LLM-generated responses to user queries.
33.8K Pulls 17 Tags Updated 5 weeks ago
The IBM Granite 2B and 8B models are designed to support tool-based use cases and retrieval augmented generation (RAG), streamlining code generation, translation, and bug fixing.
22.7K Pulls 33 Tags Updated yesterday
The IBM Granite 1B and 3B models are the first mixture of experts (MoE) Granite models from IBM, designed for low-latency usage.
15K Pulls 33 Tags Updated yesterday
SmolLM2 is a family of compact language models available in three sizes: 135M, 360M, and 1.7B parameters.
14.8K Pulls 49 Tags Updated 2 weeks ago
An open-weights function-calling model based on Llama 3, competitive with GPT-4o's function-calling capabilities.
14.8K Pulls 17 Tags Updated 4 months ago
Cohere For AI's language models trained to perform well across 23 different languages.
13.3K Pulls 33 Tags Updated 3 weeks ago
Athene-V2 is a 72B parameter model which excels at code completion, mathematics, and log extraction tasks.
1,342 Pulls 17 Tags Updated 5 days ago