
Reasoning model distilled from DeepSeek-R1, enhanced with GRPO using supplementary reasoning datasets.

14b


dna-r1-logo.png

We introduce DNA-R1, a specialized reasoning model optimized for the Korean language and based on Microsoft’s Phi-4. By applying large-scale reinforcement learning (RL) with the same methodology as DeepSeek-R1, we have significantly enhanced the model’s Korean reasoning capabilities. The model demonstrates a deep understanding of Korean text and strong reasoning abilities across mathematics, coding, and general reasoning tasks.
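For readers unfamiliar with GRPO (Group Relative Policy Optimization), the RL algorithm associated with the DeepSeek-R1 methodology, its central step is to score each sampled answer relative to the other answers drawn for the same prompt, rather than against a learned value model. The snippet below is a minimal sketch of that group-relative advantage computation only. It is not the DNA-R1 training code: the function name, the example rewards, and the use of the population standard deviation with a small `eps` are illustrative assumptions.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and std of its own group.

    `rewards` holds one scalar reward per answer sampled for a single prompt.
    `eps` (an assumption here) avoids division by zero when all rewards match.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; a sample std is also plausible
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled answers to one prompt (1.0 = correct).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Answers that beat their group's average receive a positive advantage and are reinforced; answers below the average receive a negative one. When every answer in a group earns the same reward, all advantages are zero and that prompt contributes no learning signal.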

dna-r1-pipeline.png
