Applications

  • Claude Code: ollama launch claude --model robit/qwen3.5-9b-r5-vision:q4km
  • Codex: ollama launch codex --model robit/qwen3.5-9b-r5-vision:q4km
  • OpenCode: ollama launch opencode --model robit/qwen3.5-9b-r5-vision:q4km
  • OpenClaw: ollama launch openclaw --model robit/qwen3.5-9b-r5-vision:q4km

Readme

[Image: r5_vision_nutrition_label.png]

Qwen3.5-9B R5 Vision (Q4_K_M)

Fine-tuned Qwen3.5-9B with distilled reasoning and full vision support. The model has 883 tensors, and the vision tower is preserved byte-for-byte from the base model. R5 was the first vision-capable distilled model. It is superseded by R7 vision, which adds PrimeIntellect data and scores 86.8% on the eval.

Capabilities

  • Vision — image understanding (reads text, describes scenes, answers visual questions)
  • Thinking — structured reasoning in <think> blocks
  • Tool calling — structured tool_calls via Ollama /api/chat
  • Instruction following — concise answers, format constraints

Eval Results

Benchmark                            Score
Diverse stochastic eval (38 tests)   84.2%
Vision probe (rendered text)         PASS
Tool calling                         PASS

Training

Quickstart

ollama run robit/qwen3.5-9b-r5-vision:q4km

Image chat

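# Base64-encode the image on a single line (-w0 is a GNU coreutils flag; on macOS use: base64 -i path/to/image.jpg)
IMG64=$(base64 -w0 path/to/image.jpg)
# Send the prompt and the encoded image to the local Ollama chat endpoint
curl -s http://localhost:11434/api/chat \
  -d '{"model":"robit/qwen3.5-9b-r5-vision:q4km","messages":[{"role":"user","content":"Describe this image.","images":["'"$IMG64"'"]}]}'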
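
The command above passes the whole encoded image as a command-line argument, which can exceed the operating system's argument-length limit for larger images. A more portable sketch streams the request body to curl on stdin instead. The helper name image_request is hypothetical, and the sketch does no JSON escaping, so the prompt must not contain quotes or backslashes.

```shell
MODEL="robit/qwen3.5-9b-r5-vision:q4km"

# Print a /api/chat body with one base64-encoded image attached.
# $1 = image path, $2 = optional prompt (no quotes or backslashes,
# because this sketch does not escape JSON).
image_request() {
  # Read the file on stdin and strip line wraps, so the same command
  # works with both GNU and BSD/macOS base64.
  img64=$(base64 < "$1" | tr -d '\n')
  printf '{"model":"%s","stream":false,"messages":[{"role":"user","content":"%s","images":["%s"]}]}\n' \
    "$MODEL" "${2:-Describe this image.}" "$img64"
}

# Usage:
#   image_request path/to/image.jpg | curl -s http://localhost:11434/api/chat -d @-
```

With "stream": false the server returns one JSON object instead of a stream of chunks, which is easier to inspect by hand.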

Parameters

  • RENDERER qwen3.5 + PARSER qwen3.5
  • num_ctx 131072
  • temperature 0.6, top_p 0.95
  • stop "<|im_end|>"
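
These values are the packaged defaults. As a sketch of how they might be overridden for a single request, Ollama's chat API accepts an options object. The helper name options_request is hypothetical, and the 8192 context size is an arbitrary illustration (for example, to reduce memory use), not a recommendation from the model author.

```shell
MODEL="robit/qwen3.5-9b-r5-vision:q4km"

# Print a /api/chat body that overrides sampling options for one request.
# $1 = num_ctx (illustrative default 8192), $2 = temperature, $3 = top_p.
# Temperature and top_p default to the values listed above.
options_request() {
  printf '{"model":"%s","stream":false,"messages":[{"role":"user","content":"Say hello."}],"options":{"num_ctx":%s,"temperature":%s,"top_p":%s}}\n' \
    "$MODEL" "${1:-8192}" "${2:-0.6}" "${3:-0.95}"
}

# Usage:
#   options_request 16384 | curl -s http://localhost:11434/api/chat -d @-
```

Options sent this way apply only to that request, so the model's stored defaults are left unchanged.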

Note

R5 Vision is superseded by robit/qwen3.5-9b-r7-research-vision:q4km which adds PrimeIntellect data and scores 86.8%.

License

Derived from Qwen3.5-9B (Apache 2.0). Training data licenses vary by source.