ollama run batiai/fara-7b:q6
Microsoft’s agentic multimodal model in a 7B package. These are imatrix-calibrated GGUF quantizations of the official microsoft/Fara-7B (Qwen 2.5 VL backbone, MIT), released 2026-05-19 by Microsoft Research. Free, unlimited, on-device AI for Mac via BatiFlow.
Vision input is available through a separate mmproj file on Hugging Face; the Ollama build is text-only.
| Tag | Size | Min RAM | Use case |
|---|---|---|---|
| :iq3 | 3.1 GB | 8 GB | Smallest footprint; fine on a 16 GB Mac mini M4 |
| :q3 | 3.8 GB | 8 GB | K-quant alternative to :iq3 |
| :iq4 | 4.2 GB | 10 GB | Best size/quality ratio |
| :q4 | 4.7 GB | 10 GB | Recommended for most users |
| :q5 | 5.4 GB | 12 GB | Higher fidelity |
| :q6 | 6.3 GB | 14 GB | Near-original quality |
| :q8 | 8.1 GB | 16 GB | Original-grade |
All 7 are imatrix-calibrated (wikitext-2-raw).
ollama pull batiai/fara-7b:q4
ollama run batiai/fara-7b:q4
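To script the model instead of using the interactive prompt, one option is Ollama's local REST API. The sketch below assumes an Ollama server is already running on its default port (11434) and that the `:q4` tag has been pulled. The helper names `json_escape`, `generate_payload`, and `ask_fara` are illustrative, and the escaping only covers backslashes and double quotes in single-line prompts.

```shell
#!/bin/sh
# Escape backslashes and double quotes so a single-line prompt
# can sit inside a JSON string. (Newlines are NOT handled here.)
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Build the request body for Ollama's /api/generate endpoint.
# "stream": false asks for one JSON object instead of a token stream.
generate_payload() {
  printf '{"model":"%s","prompt":"%s","stream":false}' "$1" "$(json_escape "$2")"
}

# Send a prompt to the local Ollama server and print the raw JSON reply.
ask_fara() {
  curl -s http://localhost:11434/api/generate \
    -d "$(generate_payload "batiai/fara-7b:q4" "$1")"
}
```

Calling `ask_fara "Summarize the steps to export a PDF from Preview."` prints a JSON object whose `response` field holds the generated text.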
Fara-7B is Microsoft Research’s agentic multimodal model, built on the Qwen 2.5 VL backbone. It is designed for screen understanding and agent flows (web automation, UI navigation, document analysis).
| Your Mac | :iq3 3 GB | :q3 4 GB | :iq4 4 GB | :q4 5 GB | :q5 5 GB | :q6 6 GB | :q8 8 GB |
|---|---|---|---|---|---|---|---|
| 16 GB | ✅ | ✅ | ✅ | ✅ | ✅ | ⚠ tight | ❌ |
| 24 GB | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| 32 GB+ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
16 GB Mac sweet spot: :q4 or :q5. Smallest footprint at this quality tier in the BatiAI catalog.
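The compatibility matrix can be turned into a small lookup for setup scripts. This is a sketch, not part of BatiFlow or Ollama: the function names `fit_status` and `mac_ram_gb` are made up for illustration, and treating RAM sizes between 16 GB and 24 GB like the 16 GB row is a conservative assumption, since the matrix does not list them.

```shell
#!/bin/sh
# Fit status for a tag, transcribed from the compatibility matrix.
# Prints "ok", "tight", "no", or "unknown" (tag or RAM size not in the matrix).
fit_status() {
  tag=$1
  ram_gb=$2
  case $tag in
    iq3|q3|iq4|q4|q5|q6|q8) ;;
    *) echo unknown; return 0 ;;
  esac
  if [ "$ram_gb" -lt 16 ]; then
    echo unknown              # the matrix has no row below 16 GB
  elif [ "$ram_gb" -ge 24 ]; then
    echo ok                   # 24 GB and 32 GB+ rows: every tag fits
  else
    # Assumption: 16 GB up to (not including) 24 GB behaves like the 16 GB row.
    case $tag in
      q6) echo tight ;;
      q8) echo no ;;
      *)  echo ok ;;
    esac
  fi
}

# Installed RAM in whole GiB on macOS (hw.memsize is reported in bytes).
mac_ram_gb() {
  bytes=$(sysctl -n hw.memsize) || return 1
  echo $((bytes / 1073741824))
}
```

On a Mac, `fit_status q5 "$(mac_ram_gb)"` reports whether `:q5` is expected to fit on the current machine.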
Ollama ships text-only. For image input, use llama.cpp with the separate mmproj:
hf download batiai/Fara-7B-GGUF --include "*Q4_K_M*" --include "mmproj-*-Q6_K.gguf" \
--local-dir ./fara-7b
llama-mtmd-cli \
-m ./fara-7b/microsoft-Fara-7B-Q4_K_M.gguf \
--mmproj ./fara-7b/mmproj-microsoft-Fara-7B-Q6_K.gguf \
--image input.jpg \
-p "Describe this image."
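Before launching `llama-mtmd-cli`, it can help to confirm that both downloads completed. The sketch below assumes the file names and the `./fara-7b` directory used in the commands above; the function name `check_vision_files` is illustrative. It relies on the fact that GGUF files begin with the four ASCII bytes `GGUF`, so a truncated download or a saved error page fails the check.

```shell
#!/bin/sh
# Return 0 only if the model and the mmproj file both exist
# and start with the GGUF magic bytes.
check_vision_files() {
  dir=$1
  status=0
  for f in \
    "$dir/microsoft-Fara-7B-Q4_K_M.gguf" \
    "$dir/mmproj-microsoft-Fara-7B-Q6_K.gguf"
  do
    if [ ! -f "$f" ]; then
      echo "missing: $f" >&2
      status=1
    elif [ "$(head -c 4 "$f")" != "GGUF" ]; then
      echo "not a GGUF file: $f" >&2
      status=1
    fi
  done
  return $status
}
```

A typical call is `check_vision_files ./fara-7b && echo "ready for llama-mtmd-cli"`.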
GGUF metadata: general.author=BatiAI, general.url=https://flow.bati.ai.
License: inherits source (MIT).
flow.bati.ai: free, on-device AI automation for Mac. 5 MB app, 100% local, unlimited.