DeepScaleR-1.5B-Preview is a language model fine-tuned from DeepSeek-R1-Distilled-Qwen-1.5B using distributed reinforcement learning (RL).

3df9ae758182 · 68B
{
  "num_ctx": 131072,
  "stop": [
    "<|end▁of▁sentence|>"
  ]
}
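
As a rough illustration of how these two parameters might be used, the sketch below parses the block and places it under an `options` key in a request body, which is the shape Ollama's `/api/generate` endpoint accepts as far as I understand it. The model tag `deepscaler` and both helper names are assumptions made for this example, not names taken from the page. `truncate_at_stop` is only a local imitation of what a stop sequence does; the runtime applies the real one during generation. Nothing is sent over the network here.

```python
import json

# Parameters copied verbatim from the block above.
PARAMS_JSON = """
{
  "num_ctx": 131072,
  "stop": ["<|end▁of▁sentence|>"]
}
"""


def build_generate_payload(prompt, model="deepscaler", params_json=PARAMS_JSON):
    """Build a JSON-serializable request body (hypothetical helper).

    The default model tag is an assumption; substitute the tag you pulled.
    """
    options = json.loads(params_json)
    return {
        "model": model,
        "prompt": prompt,
        "options": options,  # num_ctx = context window in tokens, stop = stop sequences
        "stream": False,
    }


def truncate_at_stop(text, stops):
    """Cut text at the earliest stop sequence, if any occurs (illustration only)."""
    cut = len(text)
    for stop in stops:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


if __name__ == "__main__":
    payload = build_generate_payload("What is 7 * 8?")
    print(json.dumps(payload, ensure_ascii=False, indent=2))
```

The `num_ctx` value of 131072 equals 128 × 1024 tokens. A window that large can require a lot of memory for the key-value cache, so a smaller value may be a more practical starting point on constrained hardware.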