DeepSeek-R1-Distill models are fine-tuned from open-source models on samples generated by DeepSeek-R1. We made slight changes to their configs and tokenizers, so please use our settings to run these models.
4,393 Pulls Updated 2 months ago