DeepSeek-R1 is a family of open reasoning models with performance approaching that of leading models, such as OpenAI o3 and Gemini 2.5 Pro.
74.5M Pulls 35 Tags Updated 5 months ago
A strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.
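A quick way to see the sparsity these figures imply: dividing the activated parameters by the total gives the fraction of the model that runs per token. This is a minimal arithmetic sketch using only the 671B/37B numbers from the description above.

```python
# MoE sparsity implied by the listing above:
# 671B total parameters, 37B activated per token.
total_params = 671e9
active_params = 37e9

active_fraction = active_params / total_params
print(f"Active per token: {active_fraction:.1%}")  # roughly 5.5%
```

In other words, each token touches only about one-eighteenth of the model's weights, which is what makes a 671B-parameter MoE economical to serve.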
3M Pulls 5 Tags Updated 11 months ago
DeepSeek Coder is a capable coding model trained on two trillion code and natural language tokens.
2.3M Pulls 102 Tags Updated 1 year ago
An open-source Mixture-of-Experts code language model that achieves performance comparable to GPT-4 Turbo in code-specific tasks.
1.3M Pulls 64 Tags Updated 1 year ago
An advanced language model trained on 2 trillion bilingual tokens.
237.6K Pulls 64 Tags Updated 2 years ago
A strong, economical, and efficient Mixture-of-Experts language model.
227.2K Pulls 34 Tags Updated 1 year ago
DeepSeek-V3.1-Terminus is a hybrid model that supports both thinking mode and non-thinking mode.
207.9K Pulls 8 Tags Updated 2 months ago
An upgraded version of DeepSeek-V2 that integrates the general and coding abilities of both DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct.
92.4K Pulls 7 Tags Updated 1 year ago
DeepSeek-OCR is a vision-language model that can perform token-efficient OCR.
65.5K Pulls 3 Tags Updated 4 weeks ago
DeepSeek-V3.2 is a model that harmonizes high computational efficiency with superior reasoning and agent performance.
4,811 Pulls 1 Tag Updated 1 week ago
A fine-tuned version of DeepSeek-R1-Distill-Qwen-1.5B that surpasses the performance of OpenAI's o1-preview on popular math evaluations with just 1.5B parameters.
870.6K Pulls 5 Tags Updated 10 months ago
A fully open-source family of reasoning models built using a dataset derived by distilling DeepSeek-R1.
633.3K Pulls 15 Tags Updated 8 months ago
A version of the DeepSeek-R1 model post-trained by Perplexity to provide unbiased, accurate, and factual information.
156.7K Pulls 9 Tags Updated 10 months ago
This version of DeepSeek-R1 is optimized for tool usage with Cline and Roo Code.
17.1K Pulls 510 Tags Updated 10 months ago
DeepSeek-R1 with the Claude 3.7 Sonnet system prompt. Inspired by incept5/llama3.1-claude.
5,018 Pulls 1 Tag Updated 9 months ago
DeepSeek-R1 optimized for tool usage with Cline.
1,661 Pulls 3 Tags Updated 9 months ago
Tiny-R1-32B-Preview outperforms the 70B model DeepSeek-R1-Distill-Llama-70B and nearly matches the full R1 model in math.
1,321 Pulls 6 Tags Updated 9 months ago
A distilled version of Qwen/Qwen3-30B-A3B-Instruct designed to inherit the reasoning and behavioral characteristics of its much larger teacher model, deepseek-ai/DeepSeek-V3.1.
859 Pulls 2 Tags Updated 3 months ago