558 2 months ago

Gemma 4 distilled from claude opus 4.6 thinking. Has only a 5% gap with claude opus 4.6 thinking while being over 40x smaller. Designed for server inference. Designed for local inference

tools thinking
{
"num_ctx": 256000,
"stop": [
"<turn|>"
],
"temperature": 0.7
}