DeepSeek-R1-Distill models are fine-tuned from open-source base models using samples generated by DeepSeek-R1. Their configs and tokenizers have been slightly modified; please use the provided settings to run these models.
5,345 Pulls 2 Tags Updated 7 months ago
DeepSeek-R1-Distill-Qwen-1.5B
3,170 Pulls 1 Tag Updated 7 months ago
deepseek-r1 with additional training on Japanese data by CyberAgent.
2,795 Pulls 2 Tags Updated 7 months ago
DeepSeek-R1-Distill-Qwen-7B
2,128 Pulls 1 Tag Updated 7 months ago
DeepSeek-R1-Distill-Qwen-14B
1,527 Pulls 1 Tag Updated 7 months ago
Model from https://huggingface.co/neody/DeepSeek-R1-Distill-Qwen-7B-gguf/tree/main
1,406 Pulls 1 Tag Updated 7 months ago
DeepSeek-R1-Distill-Qwen-Coder-32B-Fusion-9010 is a merged model combining the strengths of two DeepSeek-R1-Distill-Qwen-based models: huihui-ai/DeepSeek-R1-Distill-Qwen-32B-abliterated and huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated.
601 Pulls 6 Tags Updated 6 months ago
A DeepSeek-R1 distilled model compressed to one-fourth its original file size without losing accuracy.
534 Pulls 1 Tag Updated 6 months ago
Optimized for 38 languages
493 Pulls 1 Tag Updated 7 months ago
DeepSeek-R1 distilled 14B Qwen model, Q3_K_M quantized version.
230 Pulls 1 Tag Updated 6 months ago
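As a back-of-the-envelope check on what a Q3_K_M quantization buys, file size scales with the average bits stored per weight. The figures below are rough assumptions, not from this listing: llama.cpp's Q3_K_M mixed k-quant scheme averages roughly 3.9 bits per weight, and the 14B Qwen distill has on the order of 14.8B parameters.

```python
# Rough GGUF file-size estimate for a quantized model.
# Assumption (not from the listing): Q3_K_M averages ~3.9 bits/weight.

def gguf_size_gb(params: float, bits_per_weight: float = 3.9) -> float:
    """Approximate quantized file size in gigabytes."""
    return params * bits_per_weight / 8 / 1e9

# ~14.8B parameters assumed for the 14B Qwen distill.
print(f"~{gguf_size_gb(14.8e9):.1f} GB")
```

The same function with `bits_per_weight=16` approximates the unquantized FP16 size, which is how a low-bit quant ends up at roughly a quarter of the original file.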