Applications

Claude Code: ollama launch claude --model robit/qwen3.5-9b-r5-vision:q4km
Codex App: ollama launch codex-app --model robit/qwen3.5-9b-r5-vision:q4km
OpenClaw: ollama launch openclaw --model robit/qwen3.5-9b-r5-vision:q4km
Hermes Agent: ollama launch hermes --model robit/qwen3.5-9b-r5-vision:q4km
Codex: ollama launch codex --model robit/qwen3.5-9b-r5-vision:q4km
OpenCode: ollama launch opencode --model robit/qwen3.5-9b-r5-vision:q4km

Qwen3.5-9B R5 Vision (Q4_K_M)

Fine-tuned Qwen3.5-9B with distilled reasoning and full vision support. The model has 883 tensors, and the vision tower is preserved byte-for-byte from the base model. R5 was the first vision-capable distilled model. It has since been superseded by R7 vision (86.8% eval, with added PrimeIntellect data).

Capabilities

Vision — image understanding (reads text, describes scenes, answers visual questions)
Thinking — structured reasoning in <think> blocks
Tool calling — structured tool_calls via Ollama /api/chat
Instruction following — concise answers, format constraints
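
As a rough illustration of the tool-calling capability, the sketch below builds an /api/chat request that advertises one tool and then pulls any tool_calls out of the reply. The get_weather tool, its schema, and the helper names are hypothetical and exist only for this example. The request and response shapes follow Ollama's documented chat API as I understand it, so check them against your installed version.

```python
import json
import urllib.request

OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"
MODEL = "robit/qwen3.5-9b-r5-vision:q4km"

# Hypothetical tool definition, used only for illustration.
GET_WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def build_tool_request(prompt):
    """Build a non-streaming chat payload that offers one tool to the model."""
    return {
        "model": MODEL,
        "stream": False,
        "messages": [{"role": "user", "content": prompt}],
        "tools": [GET_WEATHER_TOOL],
    }


def extract_tool_calls(response):
    """Return (name, arguments) pairs from a chat response, or [] if none."""
    message = response.get("message") or {}
    calls = message.get("tool_calls") or []
    return [
        (call["function"]["name"], call["function"].get("arguments", {}))
        for call in calls
    ]


def chat(payload):
    """POST the payload to a local Ollama server and decode the JSON reply."""
    request = urllib.request.Request(
        OLLAMA_CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read())
```

With a local server running, extract_tool_calls(chat(build_tool_request("What is the weather in Paris?"))) should return a list of requested calls. The list is empty whenever the model answers in plain text instead.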

Eval Results

Benchmark	Score
Diverse stochastic eval (38 tests)	84.2%
Vision probe (rendered text)	PASS
Tool calling	PASS
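
A quick arithmetic check on the headline score: with 38 tests, 84.2% is what 32 passes would give. This assumes every test is a simple pass/fail with equal weight, which the table does not state, so the pass count is an inference rather than a reported figure.

```python
def pass_rate(passed, total):
    """Percentage of tests passed, rounded to one decimal place."""
    return round(100 * passed / total, 1)


# Assumption: 38 equally weighted pass/fail tests.
score_32_of_38 = pass_rate(32, 38)
score_33_of_38 = pass_rate(33, 38)
```

Under the same assumption, one additional pass (33 of 38) gives 86.8%.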

Training

Base model: Qwen/Qwen3.5-9B
Method: LoRA SFT (r=32, alpha=64, LR=1e-4, 1 epoch), merged via llama-export-lora
Data: 4122 samples from:
- bespokelabs/Bespoke-Stratos-17k — DeepSeek-R1 reasoning traces
- allenai/tulu-3-sft-mixture — instruction diversity
- Open-Orca/SlimOrca — curated GPT-4 instructions
Training suite: robit-man/fine_tuning_suite
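
For readers less familiar with the LoRA settings above: in the standard LoRA formulation the low-rank update is scaled by alpha / r, so r=32 with alpha=64 gives a scaling factor of 2. The sketch below works through that and the adapter size for one weight matrix. It is not the training suite's code; the 4096 x 4096 matrix shape is a hypothetical placeholder, and the scaling assumes plain LoRA rather than a variant such as rsLoRA.

```python
def lora_scaling(r, alpha):
    """Standard LoRA scaling applied to the low-rank update B @ A."""
    return alpha / r


def lora_param_count(d_in, d_out, r):
    """Trainable parameters for one adapted matrix: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)


scaling = lora_scaling(r=32, alpha=64)

# Hypothetical layer shape, chosen only to make the arithmetic concrete.
adapter_params = lora_param_count(d_in=4096, d_out=4096, r=32)
full_params = 4096 * 4096
trainable_fraction = adapter_params / full_params
```

For that hypothetical layer the adapter holds 262,144 trainable values against 16,777,216 in the full matrix, or about 1.6%.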

Quickstart

ollama run robit/qwen3.5-9b-r5-vision:q4km

Image chat

IMG64=$(base64 -w0 path/to/image.jpg)
curl -s http://localhost:11434/api/chat \
  -d '{"model":"robit/qwen3.5-9b-r5-vision:q4km","messages":[{"role":"user","content":"Describe this image.","images":["'"$IMG64"'"]}]}'
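
The same request can be made from Python, which sidesteps shell quoting and the GNU-specific base64 -w0 flag. This is a minimal sketch using only the standard library; the helper names are made up for the example, and it sets "stream": false so the server returns a single JSON object rather than a stream of chunks.

```python
import base64
import json
import urllib.request

OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"
MODEL = "robit/qwen3.5-9b-r5-vision:q4km"


def build_image_request(image_bytes, prompt="Describe this image."):
    """Build a non-streaming chat payload carrying one base64-encoded image."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": MODEL,
        "stream": False,
        "messages": [
            {"role": "user", "content": prompt, "images": [encoded]}
        ],
    }


def describe_image(path, prompt="Describe this image."):
    """Send the image at `path` to a local Ollama server and return the reply text."""
    with open(path, "rb") as image_file:
        payload = build_image_request(image_file.read(), prompt)
    request = urllib.request.Request(
        OLLAMA_CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read())["message"]["content"]
```

Calling describe_image("path/to/image.jpg") mirrors the curl command above and requires the Ollama server to be running locally.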

Parameters

RENDERER qwen3.5 + PARSER qwen3.5
num_ctx 131072
temperature 0.6, top_p 0.95
stop "<|im_end|>"
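
These defaults can be overridden per request through the options field of the chat API, which is useful when a 131072-token context does not fit in memory. The sketch below assumes the parameters listed above are the model's defaults and merges caller overrides on top of them. The helper name and the 8192 example value are illustrative only.

```python
import copy

# Defaults as listed in the Parameters section above.
MODEL_DEFAULTS = {
    "num_ctx": 131072,
    "temperature": 0.6,
    "top_p": 0.95,
    "stop": ["<|im_end|>"],
}


def with_options(payload, **overrides):
    """Return a copy of `payload` with an options block of defaults plus overrides."""
    result = copy.deepcopy(payload)
    options = copy.deepcopy(MODEL_DEFAULTS)
    options.update(overrides)
    result["options"] = options
    return result


# Example: the same chat request with a smaller context window.
small_ctx_payload = with_options(
    {
        "model": "robit/qwen3.5-9b-r5-vision:q4km",
        "stream": False,
        "messages": [{"role": "user", "content": "Hello"}],
    },
    num_ctx=8192,
)
```

A smaller num_ctx generally reduces the memory reserved for the KV cache, at the cost of a shorter usable context.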

Note

R5 Vision is superseded by robit/qwen3.5-9b-r7-research-vision:q4km, which adds PrimeIntellect data and scores 86.8%.

License

Derived from Qwen3.5-9B (Apache 2.0). Training data licenses vary by source.