225 2 days ago

Qwen 3.5 9B quantized by BatiAI. 12.5 t/s on 16GB Mac. Best for tool calling.

tools thinking
ollama run batiai/qwen3.5-9b:q4

Details

2 days ago

ee545479fa26 · 5.6GB ·

qwen35
·
8.95B
·
Q4_K_M
{{ .Prompt }}
{ "presence_penalty": 1.5, "temperature": 1, "top_k": 20, "top_p": 0.95 }

Readme

Qwen 3.5 9B — Quantized by BatiAI

Quantized from official Alibaba weights. Verified on real Mac hardware.

Models

Tag Size 16GB Mac mini M4 M4 Max (128GB) Use Case
q4 (latest) 5.6GB 12.5 t/s 43.2 t/s 16GB Mac recommended
q6 7.4GB 4.2 t/s ⚠️ 40.8 t/s Higher quality, slower on 16GB

Quick Start

ollama run batiai/qwen3.5-9b

Why Qwen 3.5 9B?

  • Outperforms GPT-OSS-120B on MMLU-Pro benchmarks (9B vs 120B)
  • Best tool calling accuracy among open models
  • 100+ languages including excellent Korean
  • 5.6GB fits comfortably in 16GB Mac — no swap, no lag
  • Apache 2.0 license

16GB Mac — The Sweet Spot

Model Speed on 16GB Mac Size
batiai/qwen3.5-9b:q4 12.5 t/s 5.6GB
batiai/gemma4-26b:q3 0.3 t/s ❌ 13GB
gemma4:e4b (official 8B) 27.7 t/s 9.6GB

Qwen 3.5 9B Q4 delivers the best balance of intelligence and speed on 16GB Mac. Smarter than Gemma 8B, fast enough for real-time use.

Why BatiAI?

  • Quantized directly from official Alibaba weights (not third-party)
  • Verified on Mac mini M4 (16GB) + MacBook Pro M4 Max (128GB)
  • IQ3 tested and confirmed broken — Q4 is minimum viable for this model
  • Built for BatiFlow’s 57 tool functions

Built for BatiFlow

Free, on-device AI automation for Mac. 5MB app, 100% local, unlimited.

https://flow.bati.ai