DeepSeek-R1 is a family of open reasoning models with performance approaching that of leading models, such as OpenAI o3 and Gemini 2.5 Pro.
73.6M Pulls 35 Tags Updated 5 months ago
DeepSeek-V3.1-Terminus is a hybrid model that supports both thinking mode and non-thinking mode.
183.4K Pulls 8 Tags Updated 2 months ago
A strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.
2.9M Pulls 5 Tags Updated 10 months ago
DeepSeek Coder is a capable coding model trained on two trillion code and natural language tokens.
2.1M Pulls 102 Tags Updated 1 year ago
An open-source Mixture-of-Experts code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks.
1.2M Pulls 64 Tags Updated 1 year ago
A fine-tuned version of DeepSeek-R1-Distill-Qwen-1.5B that surpasses the performance of OpenAI's o1-preview on popular math evaluations with just 1.5B parameters.
784.2K Pulls 5 Tags Updated 9 months ago
DeepCoder is a fully open-source 14B coder model at o3-mini level, with a 1.5B version also available.
424.8K Pulls 9 Tags Updated 8 months ago
An advanced language model crafted with 2 trillion bilingual tokens.
228.6K Pulls 64 Tags Updated 1 year ago
A strong, economical, and efficient Mixture-of-Experts language model.
216.9K Pulls 34 Tags Updated 1 year ago
An upgraded version of DeepSeek-V2 that integrates the general and coding abilities of both DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct.
83.9K Pulls 7 Tags Updated 1 year ago
DeepSeek-OCR is a vision-language model that can perform token-efficient OCR.
44.4K Pulls 3 Tags Updated 2 weeks ago
Gemma is a family of lightweight, state-of-the-art open models built by Google DeepMind. Updated to version 1.1
5.6M Pulls 102 Tags Updated 1 year ago
Cogito v1 Preview is a family of hybrid reasoning models by Deep Cogito that outperform the best available open models of the same size, including counterparts from LLaMA, DeepSeek, and Qwen across most standard benchmarks.
845.7K Pulls 20 Tags Updated 8 months ago
A fully open-source family of reasoning models built using a dataset derived by distilling DeepSeek-R1.
619.7K Pulls 15 Tags Updated 8 months ago
EXAONE Deep, developed and released by LG AI Research, exhibits superior capabilities in various reasoning tasks, including math and coding benchmarks, with models ranging from 2.4B to 32B parameters.
289.3K Pulls 13 Tags Updated 8 months ago
A version of the DeepSeek-R1 model that has been post-trained by Perplexity to provide unbiased, accurate, and factual information.
145.1K Pulls 9 Tags Updated 9 months ago
Gemma 3n models are designed for efficient execution on everyday devices such as laptops, tablets, or phones.
857K Pulls 9 Tags Updated 5 months ago
Orca 2 is built by Microsoft Research and is a fine-tuned version of Meta's Llama 2 models. The model is designed to excel particularly in reasoning.
94.7K Pulls 33 Tags Updated 2 years ago
The smallest model in Cohere's R series delivers top-tier speed, efficiency, and quality to build powerful AI applications on commodity GPUs and edge devices.
91K Pulls 5 Tags Updated 10 months ago
The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware.
32.6K Pulls 16 Tags Updated 2 days ago