DeepSeek-R1 is a family of open reasoning models with performance approaching that of leading closed models such as OpenAI o3 and Gemini 2.5 Pro.
74.5M Pulls 35 Tags Updated 5 months ago
A fine-tuned version of DeepSeek-R1-Distill-Qwen-1.5B that surpasses the performance of OpenAI's o1-preview on popular math evaluations with just 1.5B parameters.
869.6K Pulls 5 Tags Updated 10 months ago
A fully open-source family of reasoning models built using a dataset derived by distilling DeepSeek-R1.
633.1K Pulls 15 Tags Updated 8 months ago
A version of the DeepSeek-R1 model that has been post-trained by Perplexity to provide unbiased, accurate, and factual information.
156.6K Pulls 9 Tags Updated 10 months ago
This version of Deepseek R1 is optimized for tool usage with Cline and Roo Code.
17.1K Pulls 510 Tags Updated 10 months ago
Deepseek R1 with the Claude 3.7 Sonnet system prompt. Inspired by incept5/llama3.1-claude.
5,018 Pulls 1 Tag Updated 9 months ago
Deepseek R1 optimized for tool usage with Cline.
1,661 Pulls 3 Tags Updated 9 months ago
Tiny-R1-32B-Preview outperforms the 70B model Deepseek-R1-Distill-Llama-70B and nearly matches the full R1 model in math.
1,321 Pulls 6 Tags Updated 9 months ago
1,021 Pulls 3 Tags Updated 10 months ago
Deepseek R1 with the Claude 3.5 Sonnet system prompt. Inspired by incept5/llama3.1-claude.
794 Pulls 1 Tag Updated 10 months ago
(Mostly) Uncensored Deepseek R1 based on unsloth/deepseek-r1-distill-qwen-7b-unsloth-bnb-4bit.
604 Pulls 1 Tag Updated 10 months ago
An ablated version of Deepseek R1, from mradermacher's GGUF.
461 Pulls 1 Tag Updated 10 months ago
Qihoo 360's first-generation reasoning model, Tiny-R1-32B-Preview, outperforms the 70B model Deepseek-R1-Distill-Llama-70B and nearly matches the full R1 model in math.
166 Pulls 1 Tag Updated 9 months ago
DeepCoder-14B-Preview, a code reasoning model fine-tuned from DeepSeek-R1-Distill-Qwen-14B via distributed RL.
90 Pulls 1 Tag Updated 7 months ago
21 Pulls 1 Tag Updated 10 months ago
Unsloth's DeepSeek-R1, merged and uploaded here. This is the full 671B model. MoE bits: 1.58-bit; type: UD-IQ1_S; disk size: 131 GB; accuracy: fair. Details: all MoE layers at 1.56-bit, with down_proj in the MoE a mixture of 2.06/1.56-bit.
170.9K Pulls 2 Tags Updated 10 months ago
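As a rough sanity check on the quoted figures, the effective bits-per-weight can be back-computed from the 131 GB disk size and DeepSeek-R1's 671B parameter count. This is a sketch only; it assumes decimal gigabytes and spreads the size over all parameters, ignoring tensors kept at higher precision:

```python
# Back-of-envelope: effective bits per weight for the dynamic quant.
# Assumptions: 131 GB is decimal gigabytes; all 671B parameters counted.
disk_bytes = 131e9   # quoted disk size
n_params = 671e9     # DeepSeek-R1 total parameter count
bits_per_weight = disk_bytes * 8 / n_params
print(f"{bits_per_weight:.2f} bits/weight")  # ≈ 1.56
```

The result (about 1.56 bits/weight) lines up with the "MoE all 1.56bit" detail above, with the small remainder accounted for by the higher-precision down_proj and non-expert tensors.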
DeepSeek-R1-Distill models are fine-tuned from open-source base models using samples generated by DeepSeek-R1. We made slight changes to their configs and tokenizers; please use our settings to run these models.
125.7K Pulls 2 Tags Updated 11 months ago
Unsloth's DeepSeek-R1 1.58-bit, merged and uploaded here. This is the full 671B model, dynamically quantized to 1.58 bits.
101.4K Pulls 1 Tag Updated 10 months ago
A merged GGUF of Unsloth's DeepSeek-R1 671B 2.51-bit dynamic quant.
60.5K Pulls 1 Tag Updated 10 months ago
29.2K Pulls 4 Tags Updated 7 months ago