-
llama3.1
Llama 3.1 is a new state-of-the-art model from Meta available in 8B, 70B and 405B parameter sizes.
tools 8b 70b 405b · 114M Pulls · 93 Tags · Updated 1 year ago
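Every model in this list can be pulled and queried through Ollama's local HTTP API, which listens on port 11434 by default. A minimal sketch of building a chat request for llama3.1 — the helper name and the choice of the 8b tag are illustrative, and actually sending the request assumes a running Ollama server:

```python
import json

# Default Ollama endpoint; assumes a local server started with `ollama serve`.
OLLAMA_URL = "http://localhost:11434/api/chat"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build the JSON body for Ollama's /api/chat endpoint (helper name is ours)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one JSON response instead of a token stream
    }

payload = build_chat_request("llama3.1:8b", "Why is the sky blue?")
body = json.dumps(payload)
# To send it: urllib.request.urlopen(OLLAMA_URL, body.encode()) against a live server.
```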
-
llama3.2
Meta's Llama 3.2 goes small with 1B and 3B models.
tools 1b 3b · 68.2M Pulls · 63 Tags · Updated 1 year ago
-
mistral
The 7B model released by Mistral AI, updated to version 0.3.
tools 7b · 29M Pulls · 84 Tags · Updated 9 months ago
-
llama3
Meta Llama 3: The most capable openly available LLM to date
8b 70b · 23.7M Pulls · 68 Tags · Updated 1 year ago
-
llava
🌋 LLaVA is a novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding. Updated to version 1.6.
vision 7b 13b 34b · 14M Pulls · 98 Tags · Updated 2 years ago
-
mxbai-embed-large
State-of-the-art large embedding model from mixedbread.ai
embedding 335m · 10.3M Pulls · 4 Tags · Updated 1 year ago
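Embedding models like this one map text to fixed-length vectors that are typically compared with cosine similarity. A dependency-free sketch, using toy 4-dimensional vectors in place of real embedding output:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embedding output.
query = [0.1, 0.9, 0.2, 0.0]
doc_a = [0.1, 0.8, 0.3, 0.0]   # similar direction -> score near 1
doc_b = [0.9, 0.0, 0.0, 0.4]   # different direction -> score near 0
print(cosine_similarity(query, doc_a) > cosine_similarity(query, doc_b))  # True
```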
-
llama2
Llama 2 is a collection of foundation language models ranging from 7B to 70B parameters.
7b 13b 70b · 6.8M Pulls · 102 Tags · Updated 2 years ago
-
codellama
A large language model that can use text prompts to generate and discuss code.
7b 13b 34b 70b · 5.5M Pulls · 199 Tags · Updated 1 year ago
-
tinyllama
The TinyLlama project is an open endeavor to train a compact 1.1B Llama model on 3 trillion tokens.
1.1b · 4.8M Pulls · 36 Tags · Updated 2 years ago
-
llama3.2-vision
Llama 3.2 Vision is a collection of instruction-tuned image reasoning generative models in 11B and 90B sizes.
vision 11b 90b · 4.5M Pulls · 9 Tags · Updated 11 months ago
-
mistral-nemo
A state-of-the-art 12B model with 128k context length, built by Mistral AI in collaboration with NVIDIA.
tools 12b · 4.3M Pulls · 17 Tags · Updated 9 months ago
-
llama3.3
New state-of-the-art 70B model. Llama 3.3 70B offers performance similar to the Llama 3.1 405B model.
tools 70b · 3.8M Pulls · 14 Tags · Updated 1 year ago
-
dolphin3
Dolphin 3.0 Llama 3.1 8B 🐬 is the next generation of the Dolphin series of instruct-tuned models designed to be the ultimate general purpose local model, enabling coding, math, agentic, function calling, and general use cases.
8b · 3.8M Pulls · 5 Tags · Updated 1 year ago
-
olmo2
OLMo 2 is a new family of 7B and 13B models trained on up to 5T tokens. These models are on par with or better than equivalently sized fully open models, and competitive with open-weight models such as Llama 3.1 on English academic benchmarks.
7b 13b · 3.7M Pulls · 9 Tags · Updated 1 year ago
-
qwen3-vl
The most powerful vision-language model in the Qwen model family to date.
vision tools thinking cloud 2b 4b 8b 30b 32b 235b · 3.6M Pulls · 59 Tags · Updated 6 months ago
-
smollm2
SmolLM2 is a family of compact language models available in three sizes: 135M, 360M, and 1.7B parameters.
tools 135m 360m 1.7b · 3.4M Pulls · 49 Tags · Updated 1 year ago
-
snowflake-arctic-embed
A suite of text embedding models by Snowflake, optimized for performance.
embedding 22m 33m 110m 137m 335m · 3.1M Pulls · 16 Tags · Updated 2 years ago
-
all-minilm
Embedding models trained on very large sentence-level datasets.
embedding 22m 33m · 3M Pulls · 10 Tags · Updated 1 year ago
-
mistral-small
Mistral Small 3 sets a new benchmark in the “small” Large Language Models category below 70B.
tools 22b 24b · 3M Pulls · 21 Tags · Updated 1 year ago
-
mixtral
A set of Mixture of Experts (MoE) models with open weights by Mistral AI, in 8x7b and 8x22b parameter sizes.
tools 8x7b 8x22b · 2.6M Pulls · 70 Tags · Updated 1 year ago
-
falcon3
A family of efficient AI models under 10B parameters performant in science, math, and coding through innovative training techniques.
1b 3b 7b 10b · 2.6M Pulls · 17 Tags · Updated 1 year ago
-
llama2-uncensored
Uncensored Llama 2 model by George Sung and Jarrad Hope.
7b 70b · 2.5M Pulls · 34 Tags · Updated 2 years ago
-
llava-llama3
A LLaVA model fine-tuned from Llama 3 Instruct with better scores in several benchmarks.
vision 8b · 2.3M Pulls · 4 Tags · Updated 1 year ago
-
mistral-small3.2
An update to Mistral Small that improves function calling and instruction following, and reduces repetition errors.
vision tools 24b · 1.9M Pulls · 5 Tags · Updated 10 months ago
-
qwen2.5vl
Flagship vision-language model of Qwen and also a significant leap from the previous Qwen2-VL.
vision 3b 7b 32b 72b · 1.9M Pulls · 17 Tags · Updated 11 months ago
-
dolphin-llama3
Dolphin 2.9 is a new Llama 3-based model by Eric Hartford, available in 8B and 70B sizes, with a variety of instruction, conversational, and coding skills.
8b 70b · 1.9M Pulls · 53 Tags · Updated 1 year ago
-
smollm
🪐 A family of small models with 135M, 360M, and 1.7B parameters, trained on a new high-quality dataset.
135m 360m 1.7b · 1.8M Pulls · 94 Tags · Updated 1 year ago
-
dolphin-mixtral
Uncensored 8x7b and 8x22b fine-tuned models based on the Mixtral mixture-of-experts models that excel at coding tasks. Created by Eric Hartford.
8x7b 8x22b · 1.7M Pulls · 70 Tags · Updated 1 year ago
-
llama4
Meta's latest collection of multimodal models.
vision tools 16x17b 128x17b · 1.6M Pulls · 11 Tags · Updated 10 months ago
-
dolphin-phi
2.7B uncensored Dolphin model by Eric Hartford, based on the Phi language model by Microsoft Research.
2.7b · 1.5M Pulls · 15 Tags · Updated 2 years ago
-
dolphin-mistral
The uncensored Dolphin model based on Mistral that excels at coding tasks. Updated to version 2.8.
7b · 1.4M Pulls · 120 Tags · Updated 2 years ago
-
magistral
Magistral is a small, efficient reasoning model with 24B parameters.
tools thinking 24b · 1.4M Pulls · 5 Tags · Updated 10 months ago
-
translategemma
A new collection of open translation models built on Gemma 3, helping people communicate across 55 languages.
vision 4b 12b 27b · 1.3M Pulls · 13 Tags · Updated 3 months ago
-
deepscaler
A fine-tuned version of Deepseek-R1-Distilled-Qwen-1.5B that surpasses the performance of OpenAI’s o1-preview with just 1.5B parameters on popular math evaluations.
1.5b · 1.2M Pulls · 5 Tags · Updated 1 year ago
-
codestral
Codestral is Mistral AI’s first-ever code model designed for code generation tasks.
22b · 1.2M Pulls · 17 Tags · Updated 1 year ago
-
glm-4.7-flash
As the strongest model in the 30B class, GLM-4.7-Flash offers a new option for lightweight deployment that balances performance and efficiency.
tools thinking · 1.2M Pulls · 4 Tags · Updated 3 months ago
-
mistral-large
Mistral Large 2 is Mistral's new flagship model that is significantly more capable in code generation, mathematics, and reasoning with 128k context window and support for dozens of languages.
tools 123b · 1.2M Pulls · 32 Tags · Updated 1 year ago
-
lfm2.5-thinking
LFM2.5 is a new family of hybrid models designed for on-device deployment.
tools 1.2b · 1.2M Pulls · 5 Tags · Updated 3 months ago
-
wizardlm2
State of the art large language model from Microsoft AI with improved performance on complex chat, multilingual, reasoning and agent use cases.
7b 8x22b · 1.1M Pulls · 22 Tags · Updated 2 years ago
-
glm4
A strong multilingual general language model with performance competitive with Llama 3.
9b · 1.1M Pulls · 32 Tags · Updated 1 year ago
-
deepseek-llm
An advanced language model crafted with 2 trillion bilingual tokens.
7b 67b · 1.1M Pulls · 64 Tags · Updated 2 years ago
-
lfm2
LFM2 is a family of hybrid models designed for on-device deployment. LFM2-24B-A2B is the largest model in the family, scaling the architecture to 24 billion parameters while keeping inference efficient.
tools 24b · 1.1M Pulls · 6 Tags · Updated 2 months ago
-
ministral-3
The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware.
vision tools cloud 3b 8b 14b · 1.1M Pulls · 16 Tags · Updated 4 months ago
-
falcon
A large language model built by the Technology Innovation Institute (TII) for use in summarization, text generation, and chatbots.
7b 40b 180b · 1.1M Pulls · 38 Tags · Updated 2 years ago
-
neural-chat
A fine-tuned model based on Mistral with good coverage of domain and language.
7b · 995K Pulls · 50 Tags · Updated 2 years ago
-
llama2-chinese
Llama 2-based model fine-tuned to improve Chinese dialogue ability.
7b 13b · 995K Pulls · 35 Tags · Updated 2 years ago
-
stable-code
Stable Code 3B is a coding model with instruct and code completion variants on par with models such as Code Llama 7B that are 2.5x larger.
3b · 989.5K Pulls · 36 Tags · Updated 2 years ago
-
sqlcoder
SQLCoder is a code completion model fine-tuned on StarCoder for SQL generation tasks.
7b 15b · 976.2K Pulls · 48 Tags · Updated 2 years ago
-
stablelm2
Stable LM 2 is a state-of-the-art 1.6B and 12B parameter language model trained on multilingual data in English, Spanish, German, Italian, French, Portuguese, and Dutch.
1.6b 12b · 955.5K Pulls · 84 Tags · Updated 1 year ago
-
llama3-chatqa
A model from NVIDIA based on Llama 3 that excels at conversational question answering (QA) and retrieval-augmented generation (RAG).
8b 70b · 951.7K Pulls · 35 Tags · Updated 1 year ago
-
dolphincoder
A 7B and 15B uncensored variant of the Dolphin model family that excels at coding, based on StarCoder2.
7b 15b · 937K Pulls · 35 Tags · Updated 2 years ago
-
devstral
Devstral: the best open-source model for coding agents.
tools 24b · 935.2K Pulls · 5 Tags · Updated 10 months ago
-
llama3-gradient
This model extends Llama 3 8B's context length from 8k to over 1M tokens.
8b 70b · 933.1K Pulls · 35 Tags · Updated 2 years ago
-
llama-guard3
Llama Guard 3 is a series of models fine-tuned for content safety classification of LLM inputs and responses.
1b 8b · 931K Pulls · 33 Tags · Updated 1 year ago
-
samantha-mistral
A companion assistant trained in philosophy, psychology, and personal relationships. Based on Mistral.
7b · 927.5K Pulls · 49 Tags · Updated 2 years ago
-
llama3-groq-tool-use
A series of models from Groq that represent a significant advancement in open-source AI capabilities for tool use/function calling.
tools 8b 70b · 924.2K Pulls · 33 Tags · Updated 1 year ago
-
internlm2
InternLM2.5 is a 7B parameter model tailored for practical scenarios with outstanding reasoning capability.
1m 1.8b 7b 20b · 921.5K Pulls · 65 Tags · Updated 1 year ago
-
starling-lm
Starling is a large language model trained by reinforcement learning from AI feedback focused on improving chatbot helpfulness.
7b · 909.9K Pulls · 36 Tags · Updated 2 years ago
-
solar
A compact, yet powerful 10.7B large language model designed for single-turn conversation.
10.7b · 906.6K Pulls · 32 Tags · Updated 2 years ago
-
phind-codellama
Code generation model based on Code Llama.
34b · 904.7K Pulls · 49 Tags · Updated 2 years ago
-
xwinlm
Conversational model based on Llama 2 that performs competitively on various benchmarks.
7b 13b · 899.6K Pulls · 80 Tags · Updated 2 years ago
-
yarn-llama2
An extension of Llama 2 that supports a context of up to 128k tokens.
7b 13b · 891K Pulls · 67 Tags · Updated 2 years ago
-
stable-beluga
Llama 2-based model fine-tuned on an Orca-style dataset. Originally called Free Willy.
7b 13b 70b · 869.3K Pulls · 49 Tags · Updated 2 years ago
-
reader-lm
A series of models that convert HTML content to Markdown content, which is useful for content conversion tasks.
0.5b 1.5b · 866.3K Pulls · 33 Tags · Updated 1 year ago
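The HTML-to-Markdown conversion these models automate can only be approximated with fixed rules; a deliberately naive regex sketch that handles a few simple tags (real-world pages need a proper parser or a model, which is the point of reader-lm):

```python
import re

def naive_html_to_markdown(html: str) -> str:
    """Convert a few simple HTML tags to Markdown; strip everything else."""
    text = html
    text = re.sub(r"<h1>(.*?)</h1>", r"# \1", text)
    text = re.sub(r"<h2>(.*?)</h2>", r"## \1", text)
    text = re.sub(r"<strong>(.*?)</strong>", r"**\1**", text)
    text = re.sub(r'<a href="(.*?)">(.*?)</a>', r"[\2](\1)", text)
    text = re.sub(r"<[^>]+>", "", text)  # drop any remaining tags
    return text.strip()

md = naive_html_to_markdown("<h1>Title</h1>")  # → "# Title"
```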
-
shieldgemma
ShieldGemma is a set of instruction-tuned models for evaluating the safety of text prompt inputs and text output responses against a set of defined safety policies.
2b 9b 27b · 861.2K Pulls · 49 Tags · Updated 1 year ago
-
llama-pro
An expansion of Llama 2 that specializes in integrating both general language understanding and domain-specific knowledge, particularly in programming and mathematics.
853.1K Pulls · 33 Tags · Updated 2 years ago
-
yarn-mistral
An extension of Mistral to support context windows of 64K or 128K.
7b · 848K Pulls · 33 Tags · Updated 2 years ago
-
paraphrase-multilingual
Sentence-transformers model that can be used for tasks like clustering or semantic search.
embedding 278m · 841.2K Pulls · 3 Tags · Updated 1 year ago
-
bakllava
BakLLaVA is a multimodal model consisting of the Mistral 7B base model augmented with the LLaVA architecture.
vision 7b · 840.5K Pulls · 17 Tags · Updated 2 years ago
-
wizardlm
General use model based on Llama 2.
822K Pulls · 73 Tags · Updated 2 years ago
-
devstral-small-2
A 24B model that excels at using tools to explore codebases, edit multiple files, and power software engineering agents.
vision tools cloud 24b · 817.6K Pulls · 6 Tags · Updated 4 months ago
-
command-r-plus
Command R+ is a powerful, scalable large language model purpose-built to excel at real-world enterprise use cases.
tools 104b · 756.1K Pulls · 21 Tags · Updated 1 year ago
-
mistral-small3.1
Building upon Mistral Small 3, Mistral Small 3.1 (2503) adds state-of-the-art vision understanding and enhances long context capabilities up to 128k tokens without compromising text performance.
vision tools 24b · 737K Pulls · 5 Tags · Updated 1 year ago
-
tinydolphin
An experimental 1.1B parameter model trained on the new Dolphin 2.8 dataset by Eric Hartford and based on TinyLlama.
1.1b · 679.8K Pulls · 18 Tags · Updated 2 years ago
-
mistral-openorca
Mistral OpenOrca is a 7 billion parameter model, fine-tuned on top of the Mistral 7B model using the OpenOrca dataset.
7b · 647.7K Pulls · 17 Tags · Updated 2 years ago
-
wizardlm-uncensored
Uncensored version of the WizardLM model.
13b · 606.9K Pulls · 18 Tags · Updated 2 years ago
-
reflection
A high-performing model trained with a new technique called Reflection-tuning that teaches an LLM to detect mistakes in its reasoning and correct course.
70b · 582.5K Pulls · 17 Tags · Updated 1 year ago
-
nous-hermes2-mixtral
The Nous Hermes 2 model from Nous Research, now trained over Mixtral.
8x7b · 557.8K Pulls · 18 Tags · Updated 1 year ago
-
megadolphin
MegaDolphin-2.2-120b is a transformation of Dolphin-2.2-70b created by interleaving the model with itself.
120b · 536.9K Pulls · 19 Tags · Updated 2 years ago
-
medllama2
Fine-tuned Llama 2 model to answer medical questions based on an open source medical dataset.
7b · 535K Pulls · 17 Tags · Updated 2 years ago
-
everythinglm
Uncensored Llama 2-based model with support for a 16K context window.
13b · 533.2K Pulls · 18 Tags · Updated 2 years ago
-
solar-pro
Solar Pro Preview: an advanced large language model (LLM) with 22 billion parameters, designed to fit on a single GPU.
22b · 526.3K Pulls · 18 Tags · Updated 1 year ago
-
mathstral
MathΣtral: a 7B model designed for math reasoning and scientific discovery by Mistral AI.
7b · 514.7K Pulls · 17 Tags · Updated 1 year ago
-
falcon2
Falcon2 is an 11B-parameter causal decoder-only model built by TII and trained on over 5T tokens.
11b · 506.1K Pulls · 17 Tags · Updated 1 year ago
-
stablelm-zephyr
A lightweight chat model delivering accurate and responsive output without requiring high-end hardware.
3b · 502.5K Pulls · 17 Tags · Updated 2 years ago
-
duckdb-nsql
7B parameter text-to-SQL model made by MotherDuck and Numbers Station.
7b · 497.1K Pulls · 17 Tags · Updated 2 years ago
-
mistrallite
MistralLite is a fine-tuned model based on Mistral with enhanced capabilities of processing long contexts.
7b · 494K Pulls · 17 Tags · Updated 2 years ago
-
open-orca-platypus2
Merge of the Open Orca OpenChat model and the Garage-bAInd Platypus 2 model. Designed for chat and code generation.
13b · 480.7K Pulls · 17 Tags · Updated 2 years ago
-
goliath
A language model created by combining two fine-tuned Llama 2 70B models into one.
452.4K Pulls · 16 Tags · Updated 2 years ago
-
glm-ocr
GLM-OCR is a multimodal OCR model for complex document understanding, built on the GLM-V encoder–decoder architecture.
vision tools · 425.4K Pulls · 3 Tags · Updated 3 months ago
-
olmo-3
Olmo is a series of open language models designed to enable the science of language models. These models are pre-trained on the Dolma 3 dataset and post-trained on the Dolci datasets.
7b 32b · 422.1K Pulls · 15 Tags · Updated 4 months ago
-
snowflake-arctic-embed2
Snowflake's frontier embedding model. Arctic Embed 2.0 adds multilingual support without sacrificing English performance or scalability.
embedding 568m · 383.7K Pulls · 3 Tags · Updated 1 year ago
-
sailor2
Sailor2 is a family of multilingual language models made for South-East Asia. Available in 1B, 8B, and 20B parameter sizes.
1b 8b 20b · 383.1K Pulls · 13 Tags · Updated 1 year ago
-
tulu3
Tülu 3 is a leading instruction-following model family, offering fully open-source data, code, and recipes from the Allen Institute for AI.
8b 70b · 360.1K Pulls · 9 Tags · Updated 1 year ago
-
glm-5
A strong reasoning and agentic model from Z.ai with 744B total parameters (40B active), built for complex systems engineering and long-horizon tasks.
tools thinking cloud · 341K Pulls · 1 Tag · Updated 2 months ago
-
gemini-3-flash-preview
Gemini 3 Flash offers frontier intelligence built for speed at a fraction of the cost.
vision tools thinking cloud · 289K Pulls · 2 Tags · Updated 4 months ago
-
llava-phi3
A new small LLaVA model fine-tuned from Phi 3 Mini.
vision 3.8b · 278.6K Pulls · 4 Tags · Updated 1 year ago
-
glm-5.1
GLM-5.1 is our next-generation flagship model for agentic engineering, with significantly stronger coding capabilities than its predecessor. It achieves state-of-the-art performance on SWE-Bench Pro and leads GLM-5 by a wide margin.
tools thinking cloud · 273.4K Pulls · 1 Tag · Updated 4 weeks ago
-
olmo-3.1
Olmo is a series of open language models designed to enable the science of language models. These models are pre-trained on the Dolma 3 dataset and post-trained on the Dolci datasets.
tools 32b · 268.9K Pulls · 10 Tags · Updated 4 months ago
-
bge-large
Embedding model from BAAI mapping texts to vectors.
embedding 335m · 261.6K Pulls · 3 Tags · Updated 1 year ago
-
glm-4.6
Advanced agentic, reasoning and coding capabilities.
tools thinking cloud · 249.8K Pulls · 1 Tag · Updated 6 months ago
-
smallthinker
A new small reasoning model fine-tuned from the Qwen 2.5 3B Instruct model.
3b · 242.4K Pulls · 5 Tags · Updated 1 year ago
-
glm-4.7
Advancing the Coding Capability
tools thinking cloud · 237.6K Pulls · 1 Tag · Updated 4 months ago
-
alfred
A robust conversational model designed to be used for both chat and instruct use cases.
40b · 226K Pulls · 7 Tags · Updated 2 years ago
-
devstral-2
A 123B model that excels at using tools to explore codebases, edit multiple files, and power software engineering agents.
tools cloud 123b · 211.2K Pulls · 6 Tags · Updated 4 months ago
-
deepseek-v4-flash
DeepSeek-V4-Flash is a preview of the DeepSeek-V4 series, a Mixture-of-Experts model with 284B total parameters and 13B activated, built for efficient reasoning across a 1M-token context window.
tools thinking cloud · 48.2K Pulls · 1 Tag · Updated 1 week ago
-
mistral-large-3
A general-purpose multimodal mixture-of-experts model for production-grade tasks and enterprise workloads.
vision tools cloud · 48.1K Pulls · 1 Tag · Updated 5 months ago
-
mistral-medium-3.5
Mistral Medium 3.5 is Mistral AI's first flagship model to merge instruction following, reasoning, and coding in a single set of 128B weights.
vision tools thinking 128b · 12.1K Pulls · 5 Tags · Updated 20 hours ago
-
laguna-xs.2
Laguna XS.2 is a 33B total-parameter Mixture-of-Experts model with 3B activated parameters per token, designed for agentic coding and long-horizon work on a local machine.
tools thinking · 6,404 Pulls · 7 Tags · Updated 1 week ago