ollama run batiai/kimi-k2.6:q5

Details

4eb34e1ae019 · 728GB · deepseek2 · 1.03T · Q5_K_M

System prompt: You are a helpful AI assistant.

Parameters: { "num_ctx": 131072, "stop": [ "<|im_end|>", "[EOS]", "[EOT]" ] }

Readme

Kimi K2.6 — Quantized by BatiAI

Frontier 1T MoE from Moonshot AI, quantized directly from official FP8 weights.

Models

Tag   Size    Min RAM   Target Hardware
q5    728GB   768GB     2× M3 Ultra 512GB / 8× A100 80GB / H100 node — highest quality
iq4   546GB   512GB     M3 Ultra 512GB / 8× A100 80GB / H100 node — recommended
iq3   394GB   384GB     M3 Ultra 512GB / H100 node — most accessible

Quick Start

ollama run batiai/kimi-k2.6:iq4    # recommended balance
ollama run batiai/kimi-k2.6:iq3    # smaller, fits 384GB+ RAM
ollama run batiai/kimi-k2.6:q5     # highest quality, needs 768GB+ RAM
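
The default parameters listed in Details above (num_ctx 131072 plus the stop tokens) can be overridden per session. A minimal sketch using Ollama's interactive /set and /show commands (the value shown is just the shipped default):

ollama run batiai/kimi-k2.6:iq4
>>> /set parameter num_ctx 131072
>>> /show parameters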

Kimi K2.6 — Why It Matters

  • 1T parameters / 32B active — frontier-class open weight model
  • SWE-Bench Pro 58.6 — beats GPT-5.4 xhigh (57.7), Claude Opus 4.6 max (53.4), Gemini 3.1 Pro (54.2)
  • HLE 36.4% / 55.5% (w/ tools) — Humanity’s Last Exam frontier tier
  • 256K native context via YaRN scaling (see the API sketch after this list)
  • Agent swarm — 300 sub-agents, 4,000 coordinated steps
  • Modified-MIT license — commercial + redistribution allowed
  • Released 2026-04-20 by Moonshot AI
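
The full 256K window is not used by default (num_ctx ships at 131072); it has to be requested explicitly. A minimal sketch against the local Ollama HTTP API, where the prompt is a placeholder and 262144 tokens (256K) assumes you have the memory headroom for the KV cache:

curl http://localhost:11434/api/generate -d '{
  "model": "batiai/kimi-k2.6:iq4",
  "prompt": "Summarize the attached codebase.",
  "options": { "num_ctx": 262144 }
}'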

Hardware Reality — Be Honest

Your System              IQ3 (394GB)     IQ4 (546GB)   Q5 (728GB)
Mac 16GB                 ❌              ❌            ❌
Mac 128GB                ❌              ❌            ❌
Mac 256GB                ⚠️ heavy swap   ❌            ❌
Mac 384GB                ⚠️ tight        ❌            ❌
Mac M3 Ultra 512GB       ✅              ✅ tight      ❌
2× M3 Ultra (cluster)    ✅              ✅            ✅
8× A100 80GB             ✅              ✅            ✅
H100 node                ✅ fast         ✅ fast       ✅ fast

This is not a consumer Mac model. For on-device Mac use, see below.
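
To see where your machine lands in the table, compare installed memory against the Min RAM column above. A one-liner for macOS:

sysctl -n hw.memsize | awk '{ printf "%.0f GB unified memory\n", $1 / 1e9 }'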

For Smaller Macs — BatiAI Lineup

Your Mac           Recommended
16GB               batiai/gemma4-e4b:q4
24GB               batiai/gemma4-26b:iq4
48GB               batiai/qwen3.5-35b:iq4
96GB               batiai/qwen3.6-35b:iq4
128GB              batiai/minimax-m2.7:iq3
M3 Ultra 512GB+    batiai/kimi-k2.6:iq4 (this model)
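
For example, on a 128GB machine:

ollama run batiai/minimax-m2.7:iq3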

Why BatiAI?

  • Quantized directly from official Moonshot FP8 weights (not 3rd-party re-quant)
  • imatrix calibration with 200 chunks, the point beyond which more calibration data stopped improving quality (see the pipeline sketch after this list)
  • Full general.author=BatiAI / general.url=https://flow.bati.ai signature
  • Open pipeline — see docs/202604-large-moe-quantization.md
  • Handles 1T+ MoE models (most providers stop at 70B)
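
A minimal sketch of that pipeline using llama.cpp's stock tools; the file names, calibration corpus, and IQ4_XS target are illustrative assumptions, and the actual recipe is in the linked doc:

# Build an importance matrix from 200 calibration chunks, then quantize with it
llama-imatrix -m kimi-k2.6-bf16.gguf -f calibration.txt --chunks 200 -o kimi-k2.6.imatrix
llama-quantize --imatrix kimi-k2.6.imatrix kimi-k2.6-bf16.gguf kimi-k2.6-iq4.gguf IQ4_XS

# Check the provenance fields baked into the GGUF (gguf-py's dump script; invocation assumed)
python gguf_dump.py kimi-k2.6-iq4.gguf | grep -E 'general\.(author|url)'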

Built for BatiFlow (plus frontier research)

BatiFlow is our on-device Mac AI automation app (free, unlimited, local). The smaller models in our lineup (gemma4, qwen3.5-35b, qwen3.6, minimax-m2.7) serve BatiFlow users directly.

Kimi K2.6 is different: it is a frontier research / workstation model, beyond consumer hardware reach. We quantize it to show that the pipeline scales to the largest frontier models, and for researchers and teams with workstation-class GPUs.