
Ultra-compressed Gemma 4 models (Q3_K_S quantization), optimized for mobile, edge, and other resource-constrained environments. They are 50-57% smaller than the stock models and run inference about 13% faster.
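As a rough illustration of what the 50-57% figure means in practice, the sketch below converts a stock model size into the expected compressed range. The 10 GB input is a hypothetical value chosen only to make the arithmetic easy to follow; it is not a measured size of any Gemma 4 variant.

```python
def compressed_size_range(stock_gb: float) -> tuple[float, float]:
    """Return (smallest, largest) expected size in GB after a 50-57% reduction."""
    smallest = stock_gb * (1 - 0.57)  # best case: 57% smaller
    largest = stock_gb * (1 - 0.50)   # worst case: 50% smaller
    return smallest, largest


# 10.0 GB is a hypothetical stock size, used for illustration only.
low, high = compressed_size_range(10.0)
print(f"{low:.1f} to {high:.1f} GB")
```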

tools thinking e2b e4b 26b 31b
0db043106dc7 · 118B
{
  "num_batch": 512,
  "num_ctx": 16384,
  "num_thread": 8,
  "stop": [
    "<turn|>"
  ],
  "temperature": 1,
  "top_k": 64,
  "top_p": 0.95
}
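The parameters above can also be supplied per request rather than relying on the defaults baked into the model. The sketch below assumes the model is served by a local Ollama instance on its default port (11434) and uses the `/api/generate` endpoint with an `options` object. The listing does not state the model's pull name, so the name is left as a caller-supplied argument, and `build_payload` and `generate` are hypothetical helper names introduced for this example.

```python
import json
import urllib.request

# Parameter values copied from the listing above.
MODEL_PARAMS = {
    "num_batch": 512,
    "num_ctx": 16384,
    "num_thread": 8,
    "stop": ["<turn|>"],
    "temperature": 1,
    "top_k": 64,
    "top_p": 0.95,
}


def build_payload(model: str, prompt: str, **overrides) -> dict:
    """Build a non-streaming request body, letting callers override any option."""
    options = {**MODEL_PARAMS, **overrides}
    return {"model": model, "prompt": prompt, "options": options, "stream": False}


def generate(model: str, prompt: str,
             host: str = "http://localhost:11434", **overrides) -> str:
    """Send the request to an assumed local Ollama server and return the text."""
    body = json.dumps(build_payload(model, prompt, **overrides)).encode("utf-8")
    request = urllib.request.Request(
        f"{host}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

On a memory-constrained device, a call such as `generate(model_name, "Hello", num_ctx=4096)` would shrink the context window for that request only, leaving the other values as listed.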