ollama run brianmatzelle/qwen3-coder-heretic:30b

4811c363f2e6 · 19GB · qwen3moe · 30.5B · Q4_K_M

Qwen3-Coder 30B (Heretic)

An abliterated build of Qwen/Qwen3-Coder-30B-A3B-Instruct, processed with Heretic to remove refusal behavior while preserving general capability. Repackaged with the Modelfile directives Ollama needs to advertise the tools capability.

TL;DR

ollama pull brianmatzelle/qwen3-coder-heretic:30b

Why this exists

The upstream mradermacher GGUF works fine with llama.cpp and llama-server, but importing it straight into Ollama lands a model with Capabilities: completion only — no tools, so OpenCode and other tool-use clients reject it. That’s an Ollama-detection issue, not a model issue: Ollama keys tool support on the Modelfile-level RENDERER/PARSER directives, which are missing on a bare FROM /path/to.gguf import.

This build adds the right directives so tools shows up and OpenCode can actually use the model.
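For anyone repackaging the GGUF themselves, the import is a short two-step sketch. The GGUF filename and the tag below are placeholders, not the exact names used for this build, and this assumes an Ollama version with the qwen3-coder renderer/parser registered:

```shell
# Hypothetical local rebuild; adjust the GGUF path to wherever you downloaded it.
cat > Modelfile <<'EOF'
FROM ./Qwen3-Coder-30B-A3B-Instruct.Q4_K_M.gguf
RENDERER qwen3-coder
PARSER qwen3-coder
EOF

ollama create my-qwen3-coder-heretic:30b -f Modelfile
ollama show my-qwen3-coder-heretic:30b   # Capabilities should now include tools
```

Without the RENDERER/PARSER lines, the same `ollama create` succeeds but `ollama show` reports completion only.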

What’s in the Modelfile

FROM <gguf>
RENDERER qwen3-coder
PARSER qwen3-coder
PARAMETER stop <|im_start|>
PARAMETER stop <|im_end|>
PARAMETER stop <|endoftext|>
PARAMETER temperature 0.7
PARAMETER top_k 20
PARAMETER top_p 0.8
PARAMETER repeat_penalty 1.05

The qwen3-coder renderer/parser are registered in Ollama 0.24+ (filename qwen3coder.go, but they register under the hyphenated name qwen3-coder — that gotcha cost me an hour).
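The sampling parameters in the Modelfile can be pictured with a toy top-k/top-p filter. The probabilities below are invented, and this is an illustration of the idea, not Ollama's actual sampler code:

```python
# Toy illustration of the Modelfile's sampling parameters.
# top_k keeps the k highest-probability tokens; top_p then keeps the
# smallest prefix of those whose cumulative probability reaches p.

def filter_top_k_top_p(probs, top_k=20, top_p=0.8):
    """Return the (token, prob) pairs that survive top_k, then top_p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    return kept

# Made-up distribution: "qux" is cut because 0.5 + 0.25 = 0.75 < 0.8,
# and adding "baz" (0.15) crosses the top_p threshold.
probs = {"foo": 0.5, "bar": 0.25, "baz": 0.15, "qux": 0.1}
print(filter_top_k_top_p(probs))
```

Lower `temperature` then sharpens sampling among whatever tokens survive this filtering; it does not change which tokens are kept.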

Verifying tools work

curl -s http://localhost:11434/api/chat -d '{
  "model":"brianmatzelle/qwen3-coder-heretic:30b",
  "messages":[{"role":"user","content":"What is 17 times 23? Use the calculator tool."}],
  "tools":[{"type":"function","function":{"name":"calculator","description":"multiplies two numbers","parameters":{"type":"object","properties":{"a":{"type":"number"},"b":{"type":"number"}},"required":["a","b"]}}}],
  "stream":false
}' | jq .message

You should get a clean tool_calls array with arguments {"a": 17, "b": 23}.
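To check the response shape without a live server, here is a sketch that parses a hand-written payload modeled on the /api/chat response the curl above should produce. The sample JSON is an assumption; exact field names can vary across Ollama versions:

```python
# Validate the tool-call response shape on a sample payload.
import json

sample = json.loads("""
{
  "message": {
    "role": "assistant",
    "content": "",
    "tool_calls": [
      {"function": {"name": "calculator", "arguments": {"a": 17, "b": 23}}}
    ]
  }
}
""")

calls = sample["message"].get("tool_calls", [])
# An empty list here means the model answered in plain text instead of
# calling the tool -- the symptom of the missing-capability problem above.
assert calls, "no tool_calls in response"
fn = calls[0]["function"]
print(fn["name"], fn["arguments"])
```

The same check works on the real curl output: pipe it through `jq .message` and look for a non-empty `tool_calls`.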

OpenCode config

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Ollama",
      "options": { "baseURL": "http://localhost:11434/v1" },
      "models": {
        "brianmatzelle/qwen3-coder-heretic:30b": { "name": "Qwen3-Coder 30B (Heretic)" }
      }
    }
  },
  "model": "ollama/brianmatzelle/qwen3-coder-heretic:30b"
}

Caveats

  • Abliteration trades a bit of capability for compliance. Heretic explicitly minimizes KL divergence from the original to limit this, but expect minor regressions on edge-case reasoning and structured output relative to the stock Qwen3-Coder.
  • Outputs are uncensored by design. Use accordingly — this is suited for personal research, not production-facing applications.
  • Inherits the upstream Apache 2.0 license.

Credits

  • Base model: Qwen — Qwen3-Coder
  • Abliteration: huihui-ai (initial), Heretic by p-e-w (method)
  • GGUF + imatrix: mradermacher
  • Ollama packaging: this repo — added RENDERER qwen3-coder / PARSER qwen3-coder so tool capability is detected