# qwen3.6turboquant
Qwen 3.6 35B / 36B MoE for Ollama.
This model is published as Q4_K_M weights and is intended to run with Ollama's TurboQuant KV cache, which is enabled by setting `OLLAMA_KV_CACHE_TYPE=tbqp3/tbq3` together with `OLLAMA_FLASH_ATTENTION=1`.
## Pull
```bash
ollama pull iliafed/qwen3.6turboquant
```

PowerShell:

```powershell
$env:OLLAMA_FLASH_ATTENTION="1"
$env:OLLAMA_KV_CACHE_TYPE="tbqp3/tbq3"
$env:OLLAMA_CONTEXT_LENGTH="262144"
ollama run iliafed/qwen3.6turboquant
```

Linux/macOS:

```bash
OLLAMA_FLASH_ATTENTION=1 \
OLLAMA_KV_CACHE_TYPE=tbqp3/tbq3 \
OLLAMA_CONTEXT_LENGTH=262144 \
ollama run iliafed/qwen3.6turboquant
```

Model metadata: architecture `qwen35moe`, 36.0B parameters, Q4_K_M quantization, context length 262144, embedding length 2048.

`tbqp3/tbq3` is an Ollama runtime KV-cache setting, not a model weight quantization format stored inside the model manifest.

The model weights are Q4_K_M. TurboQuant behavior is enabled by setting:

```bash
OLLAMA_FLASH_ATTENTION=1
OLLAMA_KV_CACHE_TYPE=tbqp3/tbq3
```

before running Ollama.

Ollama 0.24.0 does not accept:

```bash
ollama create --quantize tbq3
ollama create --quantize tbqp3
```

because those are not supported weight quantization targets.
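
Because TurboQuant is switched on only through environment variables, a typo or a missing variable is easy to overlook. The following is a minimal sketch of a pre-flight check, assuming a bash shell. The function name `check_turboquant_env` is hypothetical and not part of Ollama; it only compares the two variables named in this README against the values given here and does not call Ollama itself.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not an Ollama command): confirm that the two
# TurboQuant-related variables from this README hold the expected values.
check_turboquant_env() {
  local status=0

  # Flash attention must be set to 1, as listed in this README.
  if [ "${OLLAMA_FLASH_ATTENTION:-}" != "1" ]; then
    echo "OLLAMA_FLASH_ATTENTION is not set to 1" >&2
    status=1
  fi

  # The KV cache type must be the tbqp3/tbq3 pair named in this README.
  if [ "${OLLAMA_KV_CACHE_TYPE:-}" != "tbqp3/tbq3" ]; then
    echo "OLLAMA_KV_CACHE_TYPE is not set to tbqp3/tbq3" >&2
    status=1
  fi

  # Returns 0 only when both variables match.
  return "$status"
}
```

One way to use it is `check_turboquant_env && ollama run iliafed/qwen3.6turboquant`, so the model starts only when both variables match. If an Ollama server is already running in the background, it may need to be restarted with these variables in its own environment; treat that as an assumption to verify for your setup.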