DeepSeek's first generation of reasoning models, with performance comparable to OpenAI-o1, including six dense models distilled from DeepSeek-R1 based on Llama and Qwen.
30.5M Pulls 29 Tags Updated 6 weeks ago
DeepSeek Coder is a capable coding model trained on two trillion code and natural language tokens.
624.8K Pulls 102 Tags Updated 15 months ago
A fully open-source family of reasoning models built using a dataset derived by distilling DeepSeek-R1.
514.6K Pulls 9 Tags Updated 6 weeks ago
An upgraded version of DeepSeek-V2 that integrates the general and coding abilities of both DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct.
50.9K Pulls 7 Tags Updated 6 months ago
A version of the DeepSeek-R1 model that has been post-trained by Perplexity to provide unbiased, accurate, and factual information.
19.5K Pulls 9 Tags Updated 4 weeks ago
A strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.
883K Pulls 5 Tags Updated 2 months ago
1.4M Pulls 1 Tag Updated 7 weeks ago
DeepSeek's first-generation reasoning models, with performance comparable to OpenAI-o1.
437.5K Pulls 46 Tags Updated 7 weeks ago
Unsloth's DeepSeek-R1, I just merged the thing and uploaded it here. This is the full 671B model. MoE bits: 1.58bit; Type: UD-IQ1_S; Disk size: 131GB; Accuracy: Fair; Details: MoE all 1.56bit, down_proj in MoE a mixture of 2.06/1.56bit.
170.3K Pulls 2 Tags Updated 7 weeks ago
Unsloth's DeepSeek-R1 1.58-bit, I just merged the thing and uploaded it here. This is the full 671B model, albeit dynamically quantized to 1.58 bits.
99.9K Pulls 1 Tag Updated 8 weeks ago
Merged GGUF Unsloth's DeepSeek-R1 671B 2.51bit dynamic quant
59.7K Pulls 1 Tag Updated 8 weeks ago
Merged GGUF Unsloth's DeepSeek-R1 671B 1.73bit dynamic quant
26.5K Pulls 1 Tag Updated 8 weeks ago
18.3K Pulls 1 Tag Updated 2 months ago
DeepSeek-V3 from Hugging Face: a powerful solution for handling complex requests and advanced coding tasks. Enhance your development workflow with state-of-the-art code assistance and intelligent problem-solving capabilities.
16K Pulls 1 Tag Updated 2 months ago
DeepSeek's first generation of reasoning models, with performance comparable to OpenAI-o1, including six dense models distilled from DeepSeek-R1 based on Llama and Qwen. Includes tool calling support.
14.7K Pulls 26 Tags Updated 7 weeks ago
deepseek-r1:32b: a first-generation reasoning model with performance comparable to OpenAI-o1.
14.5K Pulls 1 Tag Updated 8 weeks ago
Merged GGUF Unsloth's DeepSeek-R1 671B 2.22bit dynamic quant
5,744 Pulls 1 Tag Updated 8 weeks ago
5,432 Pulls 1 Tag Updated 8 weeks ago
Ollama models of DeepSeek Janus Pro 7B
4,253 Pulls 11 Tags Updated 7 weeks ago
Unsloth's DeepSeek-R1, I just merged the thing and uploaded it here. This is the full 671B model. MoE bits: 1.73bit; Type: UD-IQ1_M; Disk size: 158GB; Accuracy: Good; Details: MoE all 1.56bit, down_proj in MoE left at 2.06bit.
4,240 Pulls 2 Tags Updated 7 weeks ago