3,046 Downloads Updated 6 days ago
ollama run carstenuhlig/omnicoder-9b
ollama launch claude --model carstenuhlig/omnicoder-9b
ollama launch codex --model carstenuhlig/omnicoder-9b
ollama launch opencode --model carstenuhlig/omnicoder-9b
ollama launch openclaw --model carstenuhlig/omnicoder-9b
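Beyond the CLI, any pulled model can also be queried through Ollama's local HTTP API (`/api/generate` on the default port 11434). A minimal non-streaming sketch; the prompt string is just an illustration:

```python
import json
from urllib import request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def build_payload(prompt: str, model: str = "carstenuhlig/omnicoder-9b") -> dict:
    # Non-streaming request; low temperature suits agentic/diff-style edits.
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"temperature": 0.3},
    }

def generate(prompt: str) -> str:
    # Sends one generation request to a locally running Ollama server
    # and returns the completed response text.
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    req = request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

payload = build_payload("Refactor this handler to use AsyncSession.")
```

With `stream: false` the server returns a single JSON object whose `response` field holds the full completion, which keeps the client loop simple.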
Qwen3.5-9B fine-tuned on 425K agentic coding trajectories from Claude Opus 4.6, GPT-5.4, and Gemini 3.1 Pro. Trained on what frontier agents actually do when editing code: read before writing, trace errors to the root cause, write diffs, not rewrites.
Weaker on niche languages (Haskell, MATLAB, assembly) and general knowledge — the training data skews heavily toward Python/JS.
| Tag | Size |
|---|---|
| latest / q4_k_m | 5.7 GB |
| q8_0 | 9.5 GB |
From the r/LocalLLaMA thread:
“The single biggest failure mode we hit with smaller models in agentic loops is they just start writing code without checking what’s already there. Ends up clobbering imports, duplicating functions, the usual mess.”
Tested on 2× RTX 5060 Ti (Q8): it matched a 30B MoE on a FastAPI refactoring task. OmniCoder handled the async/sync split correctly and produced a clean single diff; the 30B duplicated the diff block and mixed up AsyncSession with the sync Session. Prompt eval: 3,076 tok/s.
For agentic use, lower the temperature to 0.2–0.4. Benchmarks and training details: huggingface.co/Tesslate/OmniCoder-9B
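One way to bake a lower temperature in is a custom Ollama Modelfile (a sketch; `FROM` and `PARAMETER` are standard Modelfile directives, and the derived model name is arbitrary):

```
FROM carstenuhlig/omnicoder-9b
PARAMETER temperature 0.3
```

Build it once with `ollama create omnicoder-agent -f Modelfile`, then run `omnicoder-agent` so every session picks up the setting without per-request options.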