Japanese instruction-tuned LLM by CyberAgent, distilled from Qwen-72B.
275 Pulls 1 Tag Updated 12 months ago
DeepSeek-R1-Distill models are fine-tuned from open-source models using samples generated by DeepSeek-R1. We slightly changed their configs and tokenizers, so please use our settings to run these models.
147.8K Pulls 2 Tags Updated 1 year ago
2,929 Pulls 1 Tag Updated 1 year ago
717 Pulls 1 Tag Updated 1 year ago
133 Pulls 1 Tag Updated 1 year ago
75 Pulls 1 Tag Updated 1 year ago
33 Pulls 2 Tags Updated 1 year ago
DeepSeek-R1-Distill-Qwen-Coder-32B-Fusion-9010 is a mixed model that combines the strengths of two powerful DeepSeek-R1-Distill-Qwen-based models: huihui-ai/DeepSeek-R1-Distill-Qwen-32B-abliterated and huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated.
2,430 Pulls 6 Tags Updated 1 year ago