DeepSeek-R1-Distill models are fine-tuned from open-source models using samples generated by DeepSeek-R1. Their configs and tokenizers have been slightly changed, so please use our settings to run these models.
123.7K Pulls 2 Tags Updated 11 months ago
2,359 Pulls 1 Tag Updated 10 months ago
513 Pulls 1 Tag Updated 10 months ago
Japanese instruction-tuned LLM by CyberAgent, distilled from Qwen-72B.
182 Pulls 1 Tag Updated 7 months ago
113 Pulls 1 Tag Updated 10 months ago
68 Pulls 1 Tag Updated 10 months ago
20 Pulls 2 Tags Updated 10 months ago
DeepSeek-R1-Distill-Qwen-Coder-32B-Fusion-9010 is a mixed model that combines the strengths of two Qwen-based models: huihui-ai/DeepSeek-R1-Distill-Qwen-32B-abliterated and huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated.
1,075 Pulls 6 Tags Updated 9 months ago