DeepSeek-R1-Distill models are fine-tuned from open-source models using samples generated by DeepSeek-R1. We have slightly changed their configs and tokenizers, so please use our settings to run these models.

3 Tags
0b78701f73a3 • 30GB • 2 months ago
ac8082c49df2 • 9.0GB • 2 months ago
ed6b8c52bf98 • 16GB • 2 months ago