A fine-tune of Qwen3.5-9B on distilled reasoning data drawn from research-backed datasets. Trained with LoRA SFT using an additive data strategy that preserves the base model's capabilities while improving instruction following and reasoning.

tools thinking
31879571b27f · 65B
{
  "stop": [
    "<|im_end|>"
  ],
  "temperature": 0.6,
  "top_p": 0.95
}
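
As a rough illustration of how these defaults might be applied per request, the sketch below builds a JSON body for a chat call. It assumes the model is served through an Ollama-style `POST /api/chat` endpoint, which accepts sampling settings under an `options` key. The model tag `my-qwen-finetune` and the helper `build_chat_request` are hypothetical names introduced only for this example. The code builds the payload and does not send it.

```python
import json
from typing import Optional

# Values copied from the params blob above.
PARAMS = {
    "stop": ["<|im_end|>"],
    "temperature": 0.6,
    "top_p": 0.95,
}


def build_chat_request(model: str, prompt: str, overrides: Optional[dict] = None) -> dict:
    """Build a request body for an Ollama-style /api/chat call (hypothetical helper).

    Per-request overrides win over the model's default parameters.
    """
    options = {**PARAMS, **(overrides or {})}
    return {
        "model": model,  # assumption: replace with the tag you actually pulled
        "messages": [{"role": "user", "content": prompt}],
        "options": options,
        "stream": False,
    }


if __name__ == "__main__":
    body = build_chat_request("my-qwen-finetune", "Explain LoRA in one sentence.")
    print(json.dumps(body, indent=2))
```

Passing `overrides={"temperature": 0.2}` would lower the temperature for a single request while leaving `top_p` and the `<|im_end|>` stop sequence at the values shipped with the model.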