45 Downloads Updated 4 days ago
ollama run mannix/gemma4-98e:Q4_0
Updated 4 days ago
4 days ago
1f6ac5c04354 · 11GB ·
The gemma-4-A4B-98e-v3 is pruned specifically to keep intact reasoning, the token usage is higher than the original 128e version. It’s superseded by the v4 version (https://ollama.com/mannix/gemma4-98e-v4) that scores better and is within the same 128e original token usage:
HumanEval-chat token usage (164 problems × max=3072)
┌──────────────┬─────┬─────┬─────┬─────┬──────┬─────┐
│ variant │ min │ p10 │ p50 │ p90 │ max │ avg │
├──────────────┼─────┼─────┼─────┼─────┼──────┼─────┤
│ 128e @3072 │ 35 │ 125 │ 314 │ 589 │ 917 │ 334 │
├──────────────┼─────┼─────┼─────┼─────┼──────┼─────┤
│ 98e-v4 │ 35 │ 114 │ 304 │ 648 │ 895 │ 340 │
├──────────────┼─────┼─────┼─────┼─────┼──────┼─────┤
│ 98e-v3 @3072 │ 35 │ 206 │ 490 │ 897 │ 1013 │ 512 │
└──────────────┴─────┴─────┴─────┴─────┴──────┴─────┘
Template fixed for tools usage
Model on HF:
https://huggingface.co/ManniX-ITA/gemma-4-A4B-98e-v3-it
Full GGUF:
https://huggingface.co/ManniX-ITA/gemma-4-A4B-98e-v3-it-GGUF