87 pulls · Updated 3 days ago

Pruned to 98 experts: gemma-4 A4B 26B v4

tools · thinking
ollama run mannix/gemma4-98e-v4:IQ3_M
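Beyond the CLI, the model can also be queried over Ollama's REST API. A minimal sketch, assuming a local Ollama server on the default port 11434 (the `chat` helper name is illustrative, not part of any library):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # default local Ollama endpoint
MODEL = "mannix/gemma4-98e-v4:IQ3_M"

def build_chat_payload(prompt, model=MODEL):
    """Build a non-streaming request body for Ollama's /api/chat endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def chat(prompt):
    """POST the request and return the assistant's reply text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_chat_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["message"]["content"]

# requires a running server with the model pulled:
# reply = chat("Write a haiku about pruned experts.")
```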

Applications

Claude Code: ollama launch claude --model mannix/gemma4-98e-v4:IQ3_M
OpenClaw: ollama launch openclaw --model mannix/gemma4-98e-v4:IQ3_M
Hermes Agent: ollama launch hermes --model mannix/gemma4-98e-v4:IQ3_M
Codex: ollama launch codex --model mannix/gemma4-98e-v4:IQ3_M
OpenCode: ollama launch opencode --model mannix/gemma4-98e-v4:IQ3_M

Models


29 models

gemma4-98e-v4:IQ3_M · 9.8GB · 256K context window · Text · 3 days ago
gemma4-98e-v4:Q2_K_L · 8.6GB · 256K context window · Text · 3 days ago
gemma4-98e-v4:Q3_K_XL · 11GB · 256K context window · Text · 3 days ago
gemma4-98e-v4:Q6_K_L · 18GB · 256K context window · Text · 3 days ago
gemma4-98e-v4:Q2_K · 8.4GB · 256K context window · Text · 3 days ago
gemma4-98e-v4:Q3_K_S · 9.7GB · 256K context window · Text · 3 days ago
gemma4-98e-v4:Q3_K_M · 9.9GB · 256K context window · Text · 3 days ago
gemma4-98e-v4:Q3_K_L · 11GB · 256K context window · Text · 3 days ago
gemma4-98e-v4:Q4_0 · 11GB · 256K context window · Text · 3 days ago
gemma4-98e-v4:Q4_1 · 13GB · 256K context window · Text · 3 days ago
gemma4-98e-v4:Q4_K_S · 12GB · 256K context window · Text · 3 days ago
gemma4-98e-v4:Q4_K_M · 11GB · 256K context window · Text · 3 days ago
gemma4-98e-v4:Q5_K_S · 14GB · 256K context window · Text · 3 days ago
gemma4-98e-v4:Q5_K_M · 13GB · 256K context window · Text · 3 days ago
gemma4-98e-v4:Q6_K · 15GB · 256K context window · Text · 3 days ago
gemma4-98e-v4:Q8_0 · 21GB · 256K context window · Text · 3 days ago
gemma4-98e-v4:IQ2_XXS · 7.4GB · 256K context window · Text · 3 days ago
gemma4-98e-v4:IQ2_XS · 7.8GB · 256K context window · Text · 3 days ago
gemma4-98e-v4:IQ2_S · 7.8GB · 256K context window · Text · 3 days ago
gemma4-98e-v4:IQ2_M · 8.2GB · 256K context window · Text · 3 days ago
gemma4-98e-v4:IQ3_XXS · 8.9GB · 256K context window · Text · 3 days ago
gemma4-98e-v4:IQ3_XS · 9.2GB · 256K context window · Text · 3 days ago
gemma4-98e-v4:IQ4_XS · 11GB · 256K context window · Text · 3 days ago
gemma4-98e-v4:IQ4_NL · 12GB · 256K context window · Text · 3 days ago
gemma4-98e-v4:CD-Q2_K · 8.4GB · 256K context window · Text · 3 days ago
gemma4-98e-v4:CD-Q3_K_M · 9.9GB · 256K context window · Text · 3 days ago
gemma4-98e-v4:CD-Q4_K_M · 11GB · 256K context window · Text · 3 days ago
gemma4-98e-v4:CD-Q5_K_M · 13GB · 256K context window · Text · 3 days ago
gemma4-98e-v4:CD-Q6_K · 15GB · 256K context window · Text · 3 days ago

Readme

gemma-4-A4B-98e-v4 is pruned specifically to keep general knowledge as broad as possible, unlike v3, which was aimed at keeping reasoning intact. Token usage is similar to the original 128e version, and lower than v3, which needs about 1.7x as many tokens.

  HumanEval-chat token usage (164 problems × max=3072)

  ┌──────────────┬─────┬─────┬─────┬─────┬──────┬─────┐
  │   variant    │ min │ p10 │ p50 │ p90 │ max  │ avg │
  ├──────────────┼─────┼─────┼─────┼─────┼──────┼─────┤
  │ 128e @3072   │  35 │ 125 │ 314 │ 589 │  917 │ 334 │
  ├──────────────┼─────┼─────┼─────┼─────┼──────┼─────┤
  │ 98e-v4       │  35 │ 114 │ 304 │ 648 │  895 │ 340 │
  ├──────────────┼─────┼─────┼─────┼─────┼──────┼─────┤
  │ 98e-v3 @3072 │  35 │ 206 │ 490 │ 897 │ 1013 │ 512 │
  └──────────────┴─────┴─────┴─────┴─────┴──────┴─────┘
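The summary columns above can be reproduced from raw per-problem completion lengths with a sketch like this (the exact percentile method used for the table is not stated, so a simple nearest-rank percentile is assumed here; the sample data is a made-up placeholder, not the actual benchmark output):

```python
def summarize(tokens):
    """min / p10 / p50 / p90 / max / avg over per-problem token counts."""
    s = sorted(tokens)
    n = len(s)
    def pct(p):
        # nearest-rank percentile (assumption; the table's method is unstated)
        return s[min(n - 1, int(p / 100 * n))]
    return {
        "min": s[0], "p10": pct(10), "p50": pct(50), "p90": pct(90),
        "max": s[-1], "avg": round(sum(s) / n),
    }

# placeholder data, not real HumanEval-chat runs
stats = summarize([35, 120, 310, 480, 590, 900])
```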

The chat template has been fixed for tool usage.
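With tool usage working, tool definitions can be passed through the `tools` field of Ollama's /api/chat endpoint in the OpenAI-style function format. A minimal sketch, assuming a local Ollama server; the weather tool is a made-up example, not something shipped with the model:

```python
MODEL = "mannix/gemma4-98e-v4:IQ3_M"

# Hypothetical example tool, in the function schema Ollama's /api/chat accepts
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

def build_tool_request(user_msg):
    """Request body for POST http://localhost:11434/api/chat with tools attached."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": user_msg}],
        "tools": [WEATHER_TOOL],
        "stream": False,
    }

# When the model decides to call a tool, the response carries the call under
# message.tool_calls; the caller runs the tool and feeds the result back as a
# message with role "tool" in a follow-up request.
```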

Model on HF: https://huggingface.co/ManniX-ITA/gemma-4-A4B-98e-v4-it

Full GGUF: https://huggingface.co/ManniX-ITA/gemma-4-A4B-98e-v4-it-GGUF