batiai/qwen3.5-9b:q4

225 Downloads · Updated 2 days ago

Qwen 3.5 9B quantized by BatiAI. 12.5 t/s on 16GB Mac. Best for tool calling.

Capabilities: tools, thinking

```shell
ollama run batiai/qwen3.5-9b:q4
```

```shell
curl http://localhost:11434/api/chat \
  -d '{
    "model": "batiai/qwen3.5-9b:q4",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

```python
from ollama import chat

response = chat(
    model='batiai/qwen3.5-9b:q4',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)
```

```javascript
import ollama from 'ollama'

const response = await ollama.chat({
  model: 'batiai/qwen3.5-9b:q4',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)
```
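The curl request above does not set `stream`, and Ollama's chat endpoint streams by default, returning one JSON object per line. Below is a minimal sketch of stitching such a stream into a single reply, assuming each line carries a `message.content` fragment and the last line has `done` set to true. The `join_stream` helper and the sample lines are illustrative and are not captured output from this model.

```python
import json

def join_stream(lines):
    """Concatenate message fragments from newline-delimited JSON chunks."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        chunk = json.loads(line)
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            break  # the final chunk marks the end of the reply
    return "".join(parts)

# Hand-written sample chunks, shaped like the streaming response format.
sample = [
    '{"message": {"role": "assistant", "content": "Hel"}, "done": false}',
    '{"message": {"role": "assistant", "content": "lo!"}, "done": false}',
    '{"message": {"role": "assistant", "content": ""}, "done": true}',
]
print(join_stream(sample))
```

Setting `"stream": false` in the request body avoids this step and returns one JSON object instead.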

Details

ee545479fa26 · 5.6GB

model: arch qwen35 · parameters 8.95B · quantization Q4_K_M · 5.6GB

template: {{ .Prompt }} (13 bytes)

params: { "presence_penalty": 1.5, "temperature": 1, "top_k": 20, "top_p": 0.95 } (65 bytes)
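The params entry lists the sampling defaults shipped with this tag. Below is a sketch of overriding them for a single request, assuming the `options` field of Ollama's `/api/chat` accepts the same keys. The `build_chat_payload` helper is hypothetical; it only builds the JSON body and does not send it.

```python
import json

# Defaults copied from the params entry of this tag.
MODEL_DEFAULTS = {
    "presence_penalty": 1.5,
    "temperature": 1,
    "top_k": 20,
    "top_p": 0.95,
}

def build_chat_payload(prompt, **overrides):
    """Return a /api/chat request body with the defaults merged with overrides."""
    options = {**MODEL_DEFAULTS, **overrides}
    return {
        "model": "batiai/qwen3.5-9b:q4",
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
        "options": options,
    }

# Lower the temperature for one request; the other defaults are kept.
payload = build_chat_payload("Hello!", temperature=0.2)
print(json.dumps(payload, indent=2))
```

Keys that are not overridden keep the values from the tag, so the request stays consistent with what `ollama run` would use.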

Readme

# Qwen 3.5 9B — Quantized by BatiAI

Quantized from official Alibaba weights. Verified on real Mac hardware.

## Models

| Tag | Size | 16GB Mac mini M4 | M4 Max (128GB) | Use Case |
|-----|------|-------------------|----------------|----------|
| **q4** (latest) | **5.6GB** | **12.5 t/s** ✅ | 43.2 t/s | **16GB Mac recommended** |
| q6 | 7.4GB | 4.2 t/s ⚠️ | 40.8 t/s | Higher quality, slower on 16GB |
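To make the throughput figures concrete, the sketch below converts them into wall-clock time for a reply. The 500-token reply length is an assumption chosen for illustration, and the estimate covers generation only, not prompt processing or model load time.

```python
# Measured generation speeds from the table above, in tokens per second.
SPEEDS = {
    ("q4", "16GB Mac mini M4"): 12.5,
    ("q4", "M4 Max (128GB)"): 43.2,
    ("q6", "16GB Mac mini M4"): 4.2,
    ("q6", "M4 Max (128GB)"): 40.8,
}

def seconds_for(tokens, tokens_per_second):
    """Time to generate a given number of tokens at a steady rate."""
    return tokens / tokens_per_second

REPLY_TOKENS = 500  # assumed reply length, for illustration only

for (tag, machine), tps in SPEEDS.items():
    print(f"{tag} on {machine}: {seconds_for(REPLY_TOKENS, tps):.0f} s")
```

Under this assumption, q4 needs about 40 seconds on the 16GB machine, while q6 needs roughly two minutes; on the M4 Max the two tags are within a second of each other.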

## Quick Start

```
ollama run batiai/qwen3.5-9b
```

## Why Qwen 3.5 9B?

- Outperforms GPT-OSS-120B on MMLU-Pro benchmarks (9B vs 120B)
- Best tool-calling accuracy among open models
- 100+ languages, including excellent Korean
- 5.6GB fits comfortably on a 16GB Mac: no swap, no lag
- Apache 2.0 license
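Since tool calling is the headline use case, here is a sketch of the client side of it, assuming Ollama's `tools` field on the chat API and a reply whose `message.tool_calls` entries carry a function `name` and an `arguments` object. The `get_weather` tool, its schema, and the sample reply are hypothetical; they are not BatiFlow functions and not real output from this model.

```python
def get_weather(city):
    """Hypothetical local tool; a real one would query a weather service."""
    return f"Sunny in {city}"

# Registry mapping tool names to the Python callables that implement them.
TOOLS = {"get_weather": get_weather}

# JSON-schema description of the tool, passed as "tools" in the chat request.
TOOL_SPECS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def run_tool_calls(message):
    """Execute each tool call in an assistant message and build tool replies."""
    results = []
    for call in message.get("tool_calls", []):
        fn = call["function"]
        handler = TOOLS.get(fn["name"])
        if handler is None:
            output = f"unknown tool: {fn['name']}"
        else:
            output = handler(**fn["arguments"])
        results.append({"role": "tool", "content": output})
    return results

# Hand-written stand-in for an assistant reply that requests a tool.
sample_message = {
    "role": "assistant",
    "content": "",
    "tool_calls": [
        {"function": {"name": "get_weather", "arguments": {"city": "Seoul"}}}
    ],
}
print(run_tool_calls(sample_message))
```

In a real loop, the returned `tool` messages would be appended to the conversation and sent back to the model so it can write the final answer.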

## 16GB Mac — The Sweet Spot

| Model | Speed on 16GB Mac | Size |
|-------|-------------------|------|
| **batiai/qwen3.5-9b:q4** | **12.5 t/s** ✅ | 5.6GB |
| batiai/gemma4-26b:q3 | 0.3 t/s ❌ | 13GB |
| gemma4:e4b (official 8B) | 27.7 t/s | 9.6GB |

Qwen 3.5 9B Q4 delivers the best balance of intelligence and speed on a 16GB Mac. It is smarter than the Gemma 8B model (gemma4:e4b), and at 12.5 t/s it is still fast enough for real-time use.
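The size column can be read as a share of the machine's memory. The sketch below does that arithmetic for the three rows of the table. It counts the model file only, which is a lower bound: the context cache and the operating system need memory on top. The source does not state why the 26B model is so slow, so the RAM share is only one thing to look at.

```python
RAM_GB = 16

# (size in GB, tokens per second on the 16GB Mac), copied from the table above.
MODELS = {
    "batiai/qwen3.5-9b:q4": (5.6, 12.5),
    "batiai/gemma4-26b:q3": (13.0, 0.3),
    "gemma4:e4b": (9.6, 27.7),
}

def ram_share(size_gb, ram_gb=RAM_GB):
    """Fraction of total RAM taken by the model file alone."""
    return size_gb / ram_gb

for name, (size_gb, tps) in MODELS.items():
    print(f"{name}: {ram_share(size_gb):.0%} of RAM, {tps} t/s")
```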

## Why BatiAI?

- Quantized directly from official Alibaba weights (not third-party)
- Verified on a Mac mini M4 (16GB) and a MacBook Pro M4 Max (128GB)
- IQ3 was tested and confirmed broken; Q4 is the minimum viable quantization for this model
- Built for BatiFlow's 57 tool functions

## Built for BatiFlow

Free, on-device AI automation for Mac. 5MB app, 100% local, unlimited.

https://flow.bati.ai
