DeepSeek-R1 is a family of open reasoning models with performance approaching that of leading models such as o3 and Gemini 2.5 Pro.
74.5M Pulls 35 Tags Updated 5 months ago
A strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.
3M Pulls 5 Tags Updated 11 months ago
DeepSeek Coder is a capable coding model trained on two trillion code and natural language tokens.
2.3M Pulls 102 Tags Updated 1 year ago
An open-source Mixture-of-Experts code language model that achieves performance comparable to GPT-4 Turbo in code-specific tasks.
1.3M Pulls 64 Tags Updated 1 year ago
An advanced language model trained on 2 trillion bilingual tokens.
237.6K Pulls 64 Tags Updated 2 years ago
A strong, economical, and efficient Mixture-of-Experts language model.
227.2K Pulls 34 Tags Updated 1 year ago
DeepSeek-V3.1-Terminus is a hybrid model that supports both thinking mode and non-thinking mode.
207.9K Pulls 8 Tags Updated 2 months ago
An upgraded version of DeepSeek-V2 that integrates the general and coding abilities of both DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct.
92.4K Pulls 7 Tags Updated 1 year ago
DeepSeek-OCR is a vision-language model that can perform token-efficient OCR.
65.5K Pulls 3 Tags Updated 4 weeks ago
DeepSeek-V3.2 is a model that harmonizes high computational efficiency with superior reasoning and agent performance.
4,806 Pulls 1 Tag Updated 1 week ago
A fine-tuned version of DeepSeek-R1-Distill-Qwen-1.5B that, with just 1.5B parameters, surpasses the performance of OpenAI’s o1-preview on popular math evaluations.
870.6K Pulls 5 Tags Updated 10 months ago
A fully open-source family of reasoning models built using a dataset derived by distilling DeepSeek-R1.
633.3K Pulls 15 Tags Updated 8 months ago
A version of the DeepSeek-R1 model that has been post-trained by Perplexity to provide unbiased, accurate, and factual information.
156.7K Pulls 9 Tags Updated 10 months ago
5,662 Pulls 1 Tag Updated 10 months ago
DeepSeek's first-generation reasoning models, with performance comparable to OpenAI-o1.
609.1K Pulls 55 Tags Updated 6 months ago
This is a modified model that adds support for autonomous coding agents like Cline.
556.1K Pulls 6 Tags Updated 9 months ago
Unsloth's DeepSeek-R1 1.58-bit: I just merged the thing and uploaded it here. This is the full 671B model, albeit dynamically quantized to 1.58 bits.
101.4K Pulls 1 Tag Updated 10 months ago
Merged GGUF of Unsloth's DeepSeek-R1 671B 2.51-bit dynamic quant.
60.5K Pulls 1 Tag Updated 10 months ago
Merged GGUF of Unsloth's DeepSeek-R1 671B 1.73-bit dynamic quant.
26.7K Pulls 1 Tag Updated 10 months ago
DeepSeek's first generation of reasoning models, with performance comparable to OpenAI-o1, including six dense models distilled from DeepSeek-R1 based on Llama and Qwen. Includes tool calling support.
26.4K Pulls 26 Tags Updated 10 months ago