186 Downloads Updated yesterday
ollama run rafw007/qwen35-claude-coder:9b
Updated yesterday
yesterday
e1c2926abfe5 · 6.6GB ·
A family of custom models built on Qwen3.5, tuned to act as autonomous coding and administration agents. They speak the Anthropic-compatible API, so they drive Claude Code fully locally — your code never leaves your machine and cloud token cost drops to zero.
Each model ships with a system prompt focused on real work in a terminal: use tools instead of guessing, write files instead of pasting code, ground every answer in real tool output, and stay terse. Thinking is suppressed so the model acts immediately instead of monologuing. Context is set to 64K to match Claude Code’s recommended minimum.
| Model | Base | Context | Purpose |
|---|---|---|---|
| qwen35-claude-coder:4b | Qwen3.5 4B (GGUF) | 64K | Fast everyday coding agent. Lightest on memory, runs on 16GB Apple Silicon. |
| qwen35-claude-coder:9b | Qwen3.5 9B (GGUF) | 64K | Stronger coding agent — production-quality code, better reasoning and tool use. Best on 32GB+. |
MLX builds (
*-mlx) are published on Hugging Face because the ollama.com registry currently does not accept MLX-format manifests.
ollama launch claude --model <name>).ollama run rafw007/qwen35-claude-coder:9b
In Claude Code:
ollama launch claude --model rafw007/qwen35-claude-coder:9b
arp -a, nmap -sn, system_profiler rather than Linux-only commands.| Model | Placement | Verdict |
|---|---|---|
| qwen35-claude-coder:4b (GGUF) | 16GB, fits | Fast, correct algorithms; rougher edge cases. Good light agent. |
| qwen35-claude-coder:9b (GGUF) | 32GB, comfortable | Production-quality code, defensive, non-mutating. Zero emoji, zero hallucination in disk/network agent tests. |
Both pass end-to-end through Claude Code: real turns with tool calls and correct responses.
These models were designed, built and tested with the help of Claude Opus — the idea being that the best coding model in the world should be able to create smaller models in its own image. Their system prompts, parameters and context configuration come straight from that work: the world’s best coding model preparing local models that take over right on your desk.
Apache 2.0 (inherited from the base Qwen3.5).