Mistral Medium 3.5 is Mistral AI's first flagship model to merge instruction-following, reasoning, and coding in a single set of 128B weights.
29.7K Pulls 5 Tags Updated 3 weeks ago
An update to Mistral Small that improves function calling and instruction following, and produces fewer repetition errors.
2.2M Pulls 5 Tags Updated 11 months ago
Granite 4 models feature improved instruction following (IF) and tool-calling capabilities, making them more effective in enterprise applications.
1.2M Pulls 17 Tags Updated 7 months ago
The Cogito v2.1 LLMs are instruction-tuned generative models. All models are released under the MIT license for commercial use.
194.6K Pulls 6 Tags Updated 6 months ago
A state-of-the-art mixture-of-experts (MoE) language model. Kimi K2-Instruct-0905 demonstrates significant improvements in performance on public benchmarks and real-world coding agent tasks.
75K Pulls 1 Tag Updated 8 months ago
Llama 3.2 Vision is a collection of instruction-tuned image reasoning generative models in 11B and 90B sizes.
4.6M Pulls 9 Tags Updated 1 year ago
Dolphin 3.0 Llama 3.1 8B 🐬 is the next generation of the Dolphin series of instruct-tuned models designed to be the ultimate general purpose local model, enabling coding, math, agentic, function calling, and general use cases.
3.8M Pulls 5 Tags Updated 1 year ago
IBM Granite 2B and 8B models are 128K context length language models that have been fine-tuned for improved reasoning and instruction-following capabilities.
1M Pulls 3 Tags Updated 1 year ago
A LLaVA model fine-tuned from Llama 3 Instruct with better scores in several benchmarks.
2.3M Pulls 4 Tags Updated 2 years ago
Dolphin 2.9 is a new model by Eric Hartford, based on Llama 3 and available in 8B and 70B sizes, with a variety of instruction, conversational, and coding skills.
1.9M Pulls 53 Tags Updated 2 years ago
ShieldGemma is a set of instruction-tuned models for evaluating the safety of text prompt input and text output responses against a set of defined safety policies.
876.7K Pulls 49 Tags Updated 1 year ago
Stable Code 3B is a coding model with instruct and code completion variants on par with models such as Code Llama 7B that are 2.5x larger.
1M Pulls 36 Tags Updated 2 years ago
Llama-3.1-Nemotron-70B-Instruct is a large language model customized by NVIDIA to improve the helpfulness of LLM-generated responses to user queries.
582.2K Pulls 17 Tags Updated 1 year ago
EXAONE 3.5 is a collection of instruction-tuned bilingual (English and Korean) generative models ranging from 2.4B to 32B parameters, developed and released by LG AI Research.
507.7K Pulls 13 Tags Updated 1 year ago
Nexus Raven is a 13B instruction-tuned model for function calling tasks.
858.3K Pulls 32 Tags Updated 2 years ago
Tülu 3 is a leading instruction-following model family, offering fully open-source data, code, and recipes by the Allen Institute for AI.
365.4K Pulls 9 Tags Updated 1 year ago
🎩 Magicoder is a family of 7B parameter models trained on 75K synthetic instruction data using OSS-Instruct, a novel approach to enlightening LLMs with open-source code snippets.
532.3K Pulls 18 Tags Updated 2 years ago
A new small reasoning model fine-tuned from the Qwen 2.5 3B Instruct model.
246.9K Pulls 5 Tags Updated 1 year ago
A high-performing code instruct model created by merging two existing code models.
476.1K Pulls 16 Tags Updated 2 years ago
An upgraded version of DeepSeek-V2 that integrates the general and coding abilities of both DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct.
276.8K Pulls 7 Tags Updated 1 year ago