17 Downloads Updated 2 days ago
ollama run batiai/llama4-scout:iq3
Name
6 models
llama4-scout:iq3
42GB · 10M context window · Text · 2 days ago
llama4-scout:iq4
58GB · 10M context window · Text · 2 days ago
llama4-scout:q3
52GB · 10M context window · Text · 2 days ago
llama4-scout:q4
65GB · 10M context window · Text · 2 days ago
llama4-scout:q5
77GB · 10M context window · Text · 2 days ago
llama4-scout:q6
88GB · 10M context window · Text · 2 days ago
Meta’s agentic multimodal MoE in a 109B/17B-active package. imatrix-calibrated GGUF quantizations of the official meta-llama/Llama-4-Scout-17B-16E-Instruct (Llama 4 Community License), released 2025-04 by Meta. Free, unlimited, on-device AI for Mac via BatiFlow.
Multimodal-capable (vision) via separate mmproj on Hugging Face. Ollama ships text-only.
| Tag | Size | Min RAM | Use case |
|---|---|---|---|
:iq3 |
41 GB | 48 GB | Smallest footprint — M4 Max 64GB OK |
:q3 |
51 GB | 56 GB | K-quant alt for iq3 |
:iq4 |
57 GB | 64 GB | Best size/quality ratio |
:q4 |
65 GB | 72 GB | Recommended for M4 Max 128GB users |
:q5 |
76 GB | 88 GB | Higher fidelity |
:q6 |
88 GB | 96 GB | Near-original quality |
All 6 are imatrix-calibrated (wikitext-2-raw). Llama 4 Scout supports tools + extended context.
ollama pull batiai/llama4-scout:q4
ollama run batiai/llama4-scout:q4
Meta’s first Llama 4 family release. Mixture-of-Experts (16 experts × ~6B each) with single-expert routing — 17B active per token for inference efficiency.
| Your Mac | :iq3 41G |
:q3 51G |
:iq4 57G |
:q4 65G |
:q5 76G |
:q6 88G |
|---|---|---|---|---|---|---|
| 16 GB | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
| 32 GB | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
| 64 GB | ✅ tight | ❌ | ❌ | ❌ | ❌ | ❌ |
| 96 GB | ✅ | ✅ tight | ✅ tight | ❌ | ❌ | ❌ |
| 128 GB (M4 Max) | ✅ | ✅ | ✅ | ✅ | ⚠ tight | ❌ |
| 192 GB (M2 Ultra) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| 512 GB (M3 Ultra) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ comfortable |
Mac mini 16GB / 24GB sweet spot: not this model — use batiai/fara-7b (Microsoft Fara 7B, also multimodal) or batiai/qwen3.6-27b.
M4 Max 128GB recommendation: :q4 (65 GB) is the sweet spot — full quality with room for context.
Ollama ships text-only. For image input, use llama.cpp with the separate mmproj:
hf download batiai/Llama-4-Scout-17B-16E-Instruct-GGUF \
--include "*Q4_K_M*" --include "mmproj-*-Q6_K.gguf" \
--local-dir ./llama4-scout
llama-mtmd-cli \
-m ./llama4-scout/meta-llama-Llama-4-Scout-17B-16E-Instruct-Q4_K_M.gguf \
--mmproj ./llama4-scout/mmproj-meta-llama-Llama-4-Scout-17B-16E-Instruct-Q6_K.gguf \
--image input.jpg \
-p "Describe this image."
general.author=BatiAI, general.url=https://flow.bati.ai)Inherits Meta Llama 4 Community License. Commercial-friendly for orgs with < 700M MAU. - Llama 4 License - Acceptable Use Policy
flow.bati.ai — free, on-device AI automation for Mac. 5 MB app, 100 % local, unlimited.