
Qwen 3.5 9B quantized by BatiAI. 12.5 t/s on 16GB Mac. Best for tool calling.

Capabilities: tools, thinking
ollama run batiai/qwen3.5-9b

Applications

Claude Code: ollama launch claude --model batiai/qwen3.5-9b
Codex: ollama launch codex --model batiai/qwen3.5-9b
OpenCode: ollama launch opencode --model batiai/qwen3.5-9b
OpenClaw: ollama launch openclaw --model batiai/qwen3.5-9b


Qwen 3.5 9B — Quantized by BatiAI

Quantized from official Alibaba weights. Verified on real Mac hardware.

Models

| Tag | Size | 16GB Mac mini M4 | M4 Max (128GB) | Use Case |
|---|---|---|---|---|
| q4 (latest) | 5.6GB | 12.5 t/s | 43.2 t/s | Recommended for 16GB Macs |
| q6 | 7.4GB | 4.2 t/s ⚠️ | 40.8 t/s | Higher quality, slower on 16GB |

Quick Start

ollama run batiai/qwen3.5-9b
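Since tool calling is this model's headline strength, here is a minimal sketch of the request body Ollama's local /api/chat endpoint accepts for tool calling (the same schema as the OpenAI tools format). The `get_weather` tool is a hypothetical example, and the snippet only builds and prints the payload; the comment at the end assumes Ollama is serving on its default localhost:11434.

```python
import json

# Hypothetical example tool, declared in the JSON-schema "tools" format
# that Ollama's /api/chat endpoint accepts.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}]

payload = {
    "model": "batiai/qwen3.5-9b",
    "messages": [{"role": "user", "content": "What's the weather in Seoul?"}],
    "tools": tools,
    "stream": False,  # return a single JSON response instead of a stream
}

# Send this to a running Ollama server, e.g.:
#   curl http://localhost:11434/api/chat -d '<the JSON printed below>'
print(json.dumps(payload))
```

If the model decides to call the tool, the response's `message.tool_calls` will carry the function name and arguments instead of plain text.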

Why Qwen 3.5 9B?

  • Outperforms GPT-OSS-120B on MMLU-Pro despite having 9B parameters vs 120B
  • Best tool calling accuracy among open models
  • 100+ languages including excellent Korean
  • 5.6GB fits comfortably in 16GB Mac — no swap, no lag
  • Apache 2.0 license

16GB Mac — The Sweet Spot

| Model | Speed on 16GB Mac | Size |
|---|---|---|
| batiai/qwen3.5-9b:q4 | 12.5 t/s | 5.6GB |
| batiai/gemma4-26b:q3 | 0.3 t/s ❌ | 13GB |
| gemma4:e4b (official 8B) | 27.7 t/s | 9.6GB |

Qwen 3.5 9B Q4 delivers the best balance of intelligence and speed on a 16GB Mac: smarter than Gemma 8B, and fast enough for real-time use.
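The pattern in the table above can be sanity-checked with back-of-the-envelope arithmetic. This sketch assumes roughly 75% of unified memory is available to the model and about 2GB of headroom for context/KV cache; both figures are illustrative guesses, not measurements.

```python
# Rough fit check for a 16GB Mac. USABLE_FRACTION and KV_CACHE_GB are
# assumed values for illustration, not measured on real hardware.
TOTAL_RAM_GB = 16.0
USABLE_FRACTION = 0.75   # assumed share of unified memory the model can use
KV_CACHE_GB = 2.0        # assumed allowance for context / KV cache

def fits(model_size_gb: float) -> bool:
    """True if the model file plus cache headroom stays in the RAM budget."""
    return model_size_gb + KV_CACHE_GB <= TOTAL_RAM_GB * USABLE_FRACTION

print(fits(5.6))   # q4 9B: well inside the ~12GB budget
print(fits(13.0))  # q3 26B: over budget, so it swaps and crawls
```

This matches the table: the 5.6GB q4 stays inside the budget, while a 13GB quant exceeds it and falls back to swapping, which explains the 0.3 t/s figure.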

Why BatiAI?

  • Quantized directly from official Alibaba weights (not third-party)
  • Verified on Mac mini M4 (16GB) + MacBook Pro M4 Max (128GB)
  • IQ3 was tested and confirmed broken; Q4 is the minimum viable quant for this model
  • Built for BatiFlow’s 57 tool functions

Built for BatiFlow

Free, on-device AI automation for Mac. 5MB app, 100% local, unlimited.

https://flow.bati.ai