
-
light-r1
The first successful open-source RL attempt on models of similar sizes that were already fine-tuned on long CoT, under a light budget. Light-R1-14B is also the state-of-the-art 14B math model, scoring 74.0 on AIME24 and 60.2 on AIME25 and outperforming many 32B models.
7b 14b 32b
2,476 Pulls 8 Tags Updated 2 weeks ago
-
tiny-r1
Tiny-R1-32B-Preview is Qihoo 360's first-generation reasoning model. It outperforms the 70B model DeepSeek-R1-Distill-Llama-70B and nearly matches the full R1 model in math.
32b
37 Pulls 1 Tag Updated 4 weeks ago