
*TEXT-ONLY* Unsloth Quantization of Qwen3.5:9B

Capabilities: tools, thinking
ollama run isotnek/qwen3.5:9B-Unsloth-UD-Q4_K_XL

Applications

Claude Code: ollama launch claude --model isotnek/qwen3.5:9B-Unsloth-UD-Q4_K_XL
OpenClaw: ollama launch openclaw --model isotnek/qwen3.5:9B-Unsloth-UD-Q4_K_XL
Hermes Agent: ollama launch hermes --model isotnek/qwen3.5:9B-Unsloth-UD-Q4_K_XL
Codex: ollama launch codex --model isotnek/qwen3.5:9B-Unsloth-UD-Q4_K_XL
OpenCode: ollama launch opencode --model isotnek/qwen3.5:9B-Unsloth-UD-Q4_K_XL

Readme

This is an Unsloth quantization of Qwen3.5-9B. For a full list of other quants, see the linked Unsloth Hugging Face repo. This specific quant was selected based on the analysis in this blog post, which found it strikes a good balance between performance preservation and model-size reduction.

This model is text-only because Ollama doesn’t yet support specifying an mmproj file when creating a Modelfile from a GGUF. Still, it’s a great model made better and faster by the good folks at Unsloth.

This model is set to reason by default. To disable reasoning in the Ollama CLI you can run:

ollama run isotnek/qwen3.5:9B-Unsloth-UD-Q4_K_XL

and then enter “/set nothink” in the chat window.
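The same toggle is also available per-request in recent Ollama versions via the `think` field of the REST API. A minimal sketch using only the Python standard library (the host and port assume the default local Ollama daemon; adjust if yours differs):

```python
import json
import urllib.request

# Request body for Ollama's /api/chat endpoint. "think": False disables
# the model's reasoning phase for this request only; "stream": False
# returns a single JSON object instead of a stream of chunks.
payload = {
    "model": "isotnek/qwen3.5:9B-Unsloth-UD-Q4_K_XL",
    "messages": [{"role": "user", "content": "Briefly explain GGUF quantization."}],
    "think": False,
    "stream": False,
}

def chat(body: dict, host: str = "http://localhost:11434") -> dict:
    """POST the body to the local Ollama daemon and return the parsed reply."""
    req = urllib.request.Request(
        f"{host}/api/chat",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Calling `chat(payload)` returns the full response object; the reply text is under `message.content`.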

To run this model (and others in the linked Unsloth repo) with multimodal inference, I recommend using llama.cpp instead. Download your desired model GGUF and the matching mmproj-*.gguf file, then:

brew install llama.cpp

# Swap in your preferred model and mmproj files, if different.
# -ngl 99 offloads all layers to the GPU (Metal on Apple Silicon).
llama-server \
  -m ./Qwen3.5-9B-UD-Q4_K_XL.gguf \
  --mmproj ./mmproj-BF16.gguf \
  --host 0.0.0.0 \
  --port 8080 \
  -ngl 99

and to run inference:

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "data:image/png;base64,'$(base64 -i /Path/To/Your/Image.png)'"}}
      ]
    }]
  }'
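Splicing a large base64 blob into a single-quoted curl body is easy to get wrong; building the JSON body programmatically sidesteps the quoting entirely. A sketch using only the Python standard library (`build_image_message` is a hypothetical helper name, not part of any API):

```python
import base64
from pathlib import Path

def build_image_message(image_path: str, prompt: str) -> dict:
    """Build an OpenAI-style multimodal chat message with the image inlined
    as a base64 data URL, the shape llama-server's /v1/chat/completions expects."""
    raw = Path(image_path).read_bytes()
    b64 = base64.b64encode(raw).decode("ascii")
    suffix = Path(image_path).suffix.lstrip(".").lower() or "png"
    mime = {"jpg": "jpeg"}.get(suffix, suffix)  # .jpg files use the image/jpeg MIME type
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                "image_url": {"url": f"data:image/{mime};base64,{b64}"},
            },
        ],
    }

# Usage: POST {"model": "qwen", "messages": [build_image_message(path, question)]}
# as JSON to http://localhost:8080/v1/chat/completions.
```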