698 3 weeks ago

A 3B model that shouldn't be this good: it crushes benchmarks through deep chain-of-thought reasoning.

684b31d25508 · 35B
{
  "num_ctx": 8192,
  "temperature": 0.7
}
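The two parameters above set the context window (8192 tokens) and the sampling temperature (0.7). The listing does not say which runtime it targets; the sketch below assumes a local Ollama server on its default port, since `num_ctx` is the option name Ollama uses, and shows one way these defaults might be passed (or overridden) per request. The model name is a hypothetical placeholder, not the tag of this model.

```python
import json
import urllib.request

# Parameters copied from the listing above.
DEFAULT_OPTIONS = {"num_ctx": 8192, "temperature": 0.7}

# Assumption: a local Ollama server listening on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model: str, prompt: str, **overrides) -> dict:
    """Build a non-streaming generate request that carries the listed options.

    Keyword overrides (e.g. temperature=0.2) replace the defaults for one call.
    """
    options = {**DEFAULT_OPTIONS, **overrides}
    return {"model": model, "prompt": prompt, "stream": False, "options": options}


def generate(model: str, prompt: str, **overrides) -> str:
    """Send the request and return the generated text.

    Requires a running server with the model already pulled.
    """
    body = json.dumps(build_payload(model, prompt, **overrides)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    # "example-model" is a placeholder; substitute the tag you actually pulled.
    print(json.dumps(build_payload("example-model", "Why is the sky blue?"), indent=2))
```

Lowering `temperature` for a single call, for example `build_payload("example-model", "...", temperature=0.2)`, tends to make output more deterministic, while `num_ctx` bounds how much prompt and reasoning text fits in one request.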