Japanese instruction-tuned LLM by CyberAgent, distilled from Qwen-72B.
241 Pulls 1 Tag Updated 10 months ago
DeepSeek-R1-Distill models are fine-tuned from open-source models using samples generated by DeepSeek-R1. Their configs and tokenizers have been slightly changed, so please use our settings to run these models.
147.3K Pulls 2 Tags Updated 1 year ago
2,705 Pulls 1 Tag Updated 1 year ago
657 Pulls 1 Tag Updated 1 year ago
128 Pulls 1 Tag Updated 1 year ago
69 Pulls 1 Tag Updated 1 year ago
28 Pulls 2 Tags Updated 1 year ago
DeepSeek-R1-Distill-Qwen-Coder-32B-Fusion-9010 is a merged model that combines the strengths of two Qwen-based 32B models: huihui-ai/DeepSeek-R1-Distill-Qwen-32B-abliterated and huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated.
2,250 Pulls 6 Tags Updated 1 year ago