Tiny-R1-32B-Preview is Qihoo 360's first-generation reasoning model. It outperforms the 70B model DeepSeek-R1-Distill-Llama-70B and nearly matches the full R1 model in math.

32b

45 5 weeks ago

f875de4b2988 · 130B
{
  "repeat_penalty": 1.25,
  "stop": [
    "<|end▁of▁sentence|>",
    "<|endoftext|>"
  ],
  "temperature": 0.6,
  "top_p": 0.95
}
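
The parameters above are defaults, and a client can override any of them per request. The sketch below shows one way to do that, assuming the model is served by a local Ollama instance on its default port and queried through the `/api/generate` endpoint. The model tag `tiny-r1:32b`, the helper names `build_payload` and `generate`, and the URL are illustrative assumptions, not values taken from this page; substitute the tag you actually pulled.

```python
import json
import urllib.request

# Sampling parameters copied from the parameter file above.
PARAMS = {
    "repeat_penalty": 1.25,
    "stop": ["<|end▁of▁sentence|>", "<|endoftext|>"],
    "temperature": 0.6,
    "top_p": 0.95,
}

# Hypothetical values: replace with your own model tag and server address.
MODEL_NAME = "tiny-r1:32b"
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(prompt, overrides=None):
    """Merge per-request overrides into the default sampling parameters."""
    options = {**PARAMS, **(overrides or {})}
    return {
        "model": MODEL_NAME,
        "prompt": prompt,
        "stream": False,  # ask for a single JSON object instead of a stream
        "options": options,
    }


def generate(prompt, overrides=None):
    """Send the request to the server and return the generated text."""
    body = json.dumps(build_payload(prompt, overrides)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Calling `generate("Solve 12 * 17 step by step.", {"temperature": 0.2})` would lower the temperature for that request only and leave the other defaults unchanged. It requires a running server with the model available.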