DeepSeek-R1 is a family of open reasoning models with performance approaching that of leading closed models such as OpenAI's o3 and Gemini 2.5 Pro.
67.1M Pulls 35 Tags Updated 3 months ago
A fine-tuned version of DeepSeek-R1-Distill-Qwen-1.5B that surpasses the performance of OpenAI's o1-preview on popular math evaluations with just 1.5B parameters.
533.1K Pulls 5 Tags Updated 8 months ago
A fully open-source family of reasoning models built using a dataset derived by distilling DeepSeek-R1.
603.4K Pulls 15 Tags Updated 6 months ago
A version of the DeepSeek-R1 model that has been post-trained by Perplexity to provide unbiased, accurate, and factual information.
119.6K Pulls 9 Tags Updated 8 months ago
This version of Deepseek R1 is optimized for tool usage with Cline and Roo Code.
16.3K Pulls 510 Tags Updated 8 months ago
Deepseek R1 with the Claude 3.7 Sonnet system prompt. Inspired by incept5/llama3.1-claude.
4,770 Pulls 1 Tag Updated 7 months ago
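Variants like this one are typically built by layering a custom system prompt over an existing model tag in an Ollama Modelfile. A minimal sketch, in which the base tag and prompt text are illustrative assumptions rather than this variant's actual configuration:

```
# Hypothetical Modelfile: layer a custom system prompt onto a base DeepSeek-R1 tag.
FROM deepseek-r1:7b
SYSTEM """
You are a helpful assistant. (Replace with the desired Claude-style system prompt.)
"""
```

Such a variant can then be built and run locally with `ollama create my-r1 -f Modelfile` followed by `ollama run my-r1`.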
Deepseek R1 optimized for tool usage with Cline.
1,565 Pulls 3 Tags Updated 7 months ago
Tiny-R1-32B-Preview outperforms the 70B model DeepSeek-R1-Distill-Llama-70B and nearly matches the full R1 model in math.
1,281 Pulls 6 Tags Updated 7 months ago
Deepseek R1 with the Claude 3.5 Sonnet system prompt. Inspired by incept5/llama3.1-claude.
771 Pulls 1 Tag Updated 8 months ago
(Mostly) Uncensored Deepseek R1 based on unsloth/deepseek-r1-distill-qwen-7b-unsloth-bnb-4bit.
585 Pulls 1 Tag Updated 8 months ago
An ablated version of DeepSeek-R1, from mradermacher's GGUF.
452 Pulls 1 Tag Updated 8 months ago
Qihoo 360's first-generation reasoning model, Tiny-R1-32B-Preview, outperforms the 70B model DeepSeek-R1-Distill-Llama-70B and nearly matches the full R1 model in math.
137 Pulls 1 Tag Updated 7 months ago
DeepCoder-14B-Preview, a code reasoning model fine-tuned from DeepSeek-R1-Distill-Qwen-14B via distributed RL.
82 Pulls 1 Tag Updated 5 months ago
Unsloth's DeepSeek-R1, merged and re-uploaded here. This is the full 671B model. MoE bits: 1.58-bit; type: UD-IQ1_S; disk size: 131 GB; accuracy: fair. Details: all MoE layers at 1.56-bit, with down_proj in the MoE a mixture of 2.06/1.56-bit.
170.9K Pulls 2 Tags Updated 8 months ago
Unsloth's DeepSeek-R1 1.58-bit, merged and uploaded here. This is the full 671B model, dynamically quantized to 1.58 bits.
101.2K Pulls 1 Tag Updated 8 months ago
A merged GGUF of Unsloth's DeepSeek-R1 671B 2.51-bit dynamic quant.
60.4K Pulls 1 Tag Updated 8 months ago
DeepSeek-R1-Distill models are fine-tuned from open-source base models using samples generated by DeepSeek-R1. Their configs and tokenizers are slightly changed; please use the provided settings to run these models.
45.9K Pulls 2 Tags Updated 9 months ago