DeepSeek-OCR is a vision-language model that can perform token-efficient OCR.
141.5K Pulls 3 Tags Updated 2 months ago
DeepSeek-V3.1-Terminus is a hybrid model that supports both thinking mode and non-thinking mode.
308.4K Pulls 8 Tags Updated 4 months ago
DeepSeek-V3.2 is a model that harmonizes high computational efficiency with superior reasoning and agent performance.
22.8K Pulls 1 Tag Updated 1 month ago
A fully open-source family of reasoning models built using a dataset derived by distilling DeepSeek-R1.
742.5K Pulls 15 Tags Updated 10 months ago
A version of the DeepSeek-R1 model that has been post-trained by Perplexity to provide unbiased, accurate, and factual information.
217K Pulls 9 Tags Updated 11 months ago
DeepSeek-R1 is a family of open reasoning models with performance approaching that of leading models, such as o3 and Gemini 2.5 Pro.
77.4M Pulls 35 Tags Updated 7 months ago
DeepSeek Coder is a capable coding model trained on two trillion code and natural language tokens.
2.9M Pulls 102 Tags Updated 2 years ago
An upgraded version of DeepSeek-V2 that integrates the general and coding abilities of both DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct.
132.8K Pulls 7 Tags Updated 1 year ago
A strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.
3.3M Pulls 5 Tags Updated 1 year ago
NovaForge AI – DeepSeek Coder 6.7B Pro is a professional-grade coding AI built for production-level development.
594 Pulls 1 Tag Updated 4 weeks ago
Huggingface link - https://huggingface.co/iradukunda-dev/law-finetuned-DeepSeek-R1-Distill-Qwen-7B
73 Pulls 1 Tag Updated 4 weeks ago
This is a brand new Mixture of Experts (MoE) model from DeepSeek, specializing in coding instructions. (quantized IQ4_XS)
1,337 Pulls 3 Tags Updated 2 weeks ago
Lexa-Rho is part of the Lexa family, with performance approaching that of leading models such as GPT-5 Thinking, Gemini 2.5 Pro, and DeepSeek-R1.
305 Pulls 2 Tags Updated 5 months ago
Unsloth's DeepSeek-R1, merged and uploaded here. This is the full 671B model. MoE Bits: 1.58-bit; Type: UD-IQ1_S; Disk Size: 131GB; Accuracy: Fair; Details: MoE all 1.56-bit, down_proj in MoE a mixture of 2.06/1.56-bit.
171K Pulls 2 Tags Updated 12 months ago
A lightweight Chinese dialogue model fine-tuned from DeepSeek-R1-Distill-Qwen-1.5B, featuring a catgirl speech style and an affectionate tone.
124 Pulls 1 Tag Updated 3 months ago
This is not the ablation version. DeepSeek-V3.1 is a hybrid model that supports both thinking mode and non-thinking mode.
106 Pulls 3 Tags Updated 5 months ago
Ollama models of DeepSeek Janus Pro 7B
5,550 Pulls 11 Tags Updated 12 months ago
Vision encoder for Janus Pro 7B. This model is under testing.
4,750 Pulls 1 Tag Updated 12 months ago
DeepSeek-R1-0528-Qwen3-8B-IQ4_NL
3,169 Pulls 1 Tag Updated 8 months ago
Unsloth's DeepSeek-R1, merged and uploaded here. This is the full 671B model. MoE Bits: 1.73-bit; Type: UD-IQ1_M; Disk Size: 158GB; Accuracy: Good; Details: MoE all 1.56-bit; down_proj in MoE left at 2.06-bit.
4,323 Pulls 2 Tags Updated 12 months ago