DeepSeek-R1-Distill models are open-source models fine-tuned on samples generated by DeepSeek-R1. We made slight changes to their configs and tokenizers, so please use our settings to run these models.

2 Tags
4376ba0a1404 • 20GB • 2 months ago
7c2ad492c728 • 35GB • 2 months ago