```shell
ollama run batiai/gemma4-26b:q3
```
Quantized from official Google weights. Verified on real Mac hardware.
| Tag | Size | VRAM | M4 Max (128GB) | Use Case |
|---|---|---|---|---|
| iq4 | 13GB | 22GB | 85.8 t/s | 32GB Mac, fastest 4-bit |
| iq3 | 12GB | 19GB | 77 t/s | 24GB Mac, imatrix optimized |
| q3 (latest) | 13GB | 20GB | 70.7 t/s | 24GB Mac, standard |
| q4 | 16GB | 23GB | 74.9 t/s | 32GB+ Mac |
| q6 | 21GB | 31GB | 74.8 t/s | 36GB+ Mac, highest quality |
```shell
ollama run batiai/gemma4-26b:iq4
```
IQ4 uses importance-matrix (imatrix) quantization: calibration data identifies which weights matter most, so the quantizer spends precision there and compresses more aggressively where errors matter least.
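The core idea can be sketched in a few lines. This is a toy illustration, not the real IQ4_XS format (which uses grouped codebooks and is far more involved): each row's quantization scale is picked to minimize an importance-weighted error, where `importance` stands in for the per-column calibration statistics an imatrix collects.

```python
import numpy as np

def imatrix_quantize(w, importance, bits=4):
    """Symmetric round-to-nearest quantization of each row of `w`,
    with the per-row scale chosen by a small search to minimize the
    importance-weighted squared reconstruction error."""
    qmax = 2 ** (bits - 1) - 1
    out = np.empty_like(w)
    for i, row in enumerate(w):
        base = np.abs(row).max() / qmax
        best_err, best = np.inf, None
        # f = 1.0 is plain round-to-nearest; smaller f trades clipping
        # of unimportant outliers for finer resolution elsewhere
        for f in np.linspace(0.7, 1.0, 16):
            s = base * f
            q = np.clip(np.round(row / s), -qmax - 1, qmax)
            rec = q * s
            err = float((importance * (row - rec) ** 2).sum())
            if err < best_err:
                best_err, best = err, rec
        out[i] = best
    return out
```

Because plain round-to-nearest is one of the candidates searched, the weighted error can only improve on it; that is the whole trick behind "same bits, better quality."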
| | IQ4_XS (new) | Q4_K_M (standard) |
|---|---|---|
| Size | 13GB | 16GB |
| VRAM | 22GB | 23GB |
| Speed | 85.8 t/s | 74.9 t/s |
| Quality | 4-bit imatrix | 4-bit standard |
Same 4-bit quality in a file 3GB smaller, with ~15% higher throughput. Verified with translation, tool-calling, and math-reasoning tests; output quality was identical.
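Both claims follow directly from the comparison table; a quick arithmetic check:

```python
iq4_size_gb, q4_size_gb = 13, 16   # file sizes from the table
iq4_tps, q4_tps = 85.8, 74.9       # M4 Max throughput, tokens/s

size_saved = q4_size_gb - iq4_size_gb    # 3 GB smaller
speedup = (iq4_tps / q4_tps - 1) * 100   # ~14.6%, i.e. "15% faster"
print(size_saved, round(speedup))        # 3 15
```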
| Your Mac RAM | IQ3 (12GB) | IQ4 (13GB) | Q3 (13GB) | Q4 (16GB) | Q6 (21GB) |
|---|---|---|---|---|---|
| 16GB | ❌ Swaps | ❌ Swaps | ❌ Swaps | ❌ Won’t fit | ❌ Won’t fit |
| 24GB | ✅ Fast | ✅ Fits | ⚠️ Tight | ❌ No | ❌ No |
| 32GB | ✅ Fast | ✅ Fast | ✅ Fast | ✅ OK | ❌ No |
| 36GB+ | ✅ Fast | ✅ Fast | ✅ Fast | ✅ Fast | ✅ Fits |
| 128GB | 77 t/s | 85.8 t/s | 70.7 t/s | 74.9 t/s | 74.8 t/s |
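The table above reduces to a simple lookup. A sketch of a tag picker, assuming a ~2GB headroom rule of thumb for macOS (an assumption for illustration, not an Apple figure) and an assumed quality ordering of the tags:

```python
# VRAM needs (GB) per tag from the table above, ordered low-to-high
# by assumed quality (3-bit < 4-bit < 6-bit)
QUANTS = [("iq3", 19), ("q3", 20), ("iq4", 22), ("q4", 23), ("q6", 31)]

def recommend(ram_gb, headroom_gb=2):
    """Return the highest-quality tag that fits in unified memory,
    leaving headroom_gb free for macOS (a rule of thumb assumed here)."""
    fitting = [tag for tag, vram in QUANTS if vram <= ram_gb - headroom_gb]
    return fitting[-1] if fitting else None  # None: use a smaller model

print(recommend(24))  # iq4
```

For a 16GB Mac this returns `None`, which matches the advice below: pick a smaller model instead.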
26B models don’t fit on a 16GB Mac. Use these instead:

```shell
ollama run batiai/gemma4-e4b    # 57.1 t/s on a 16GB Mac ✅
ollama run batiai/qwen3.5-9b    # 12.5 t/s on a 16GB Mac ✅
```