
Qwen3.5 models (27B dense + 122B MoE) with always-on thinking, optimized for agentic coding, tool use, and browser automation.

Capabilities: vision, tools, thinking. Sizes: 27b, 122b.
ollama run stewartpark/qwen3.5

Applications

Claude Code: ollama launch claude --model stewartpark/qwen3.5
Codex: ollama launch codex --model stewartpark/qwen3.5
OpenCode: ollama launch opencode --model stewartpark/qwen3.5
OpenClaw: ollama launch openclaw --model stewartpark/qwen3.5
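These launchers point each CLI at the local Ollama server; any OpenAI-compatible client can reach the same model directly. A minimal sketch, assuming a local Ollama server on its default port 11434 and its OpenAI-compatible /v1 endpoint (the prompt text is illustrative):

```python
import json

# Ollama's default OpenAI-compatible endpoint (assumption: local server
# on the default port 11434; adjust if your setup differs).
BASE_URL = "http://localhost:11434/v1"

# Request body in the OpenAI chat-completions shape that such clients
# send under the hood.
payload = {
    "model": "stewartpark/qwen3.5",
    "messages": [
        {"role": "user", "content": "Refactor this function to be iterative."}
    ],
    "stream": False,
}

body = json.dumps(payload)
# To actually send it, POST `body` to f"{BASE_URL}/chat/completions"
# with Content-Type: application/json (requires a running Ollama server).
```

The network call is left commented out so the sketch stays runnable without a server.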


Readme

This is a customized Qwen3.5 collection designed for agentic workflows, available in three configurations:

  • qwen3.5:27b (default) — 27B dense parameters quantized to Q8_0 (~29GB). Native 256K context window with always-enabled thinking mode. Near-lossless quality at half the VRAM of BF16, accessible on 48GB GPUs like the RTX A6000 or dual-24GB setups.
  • qwen3.5:27b-bf16 — Same 27B dense model in full BF16 precision (~54GB). Better for vision and browser automation workloads where full precision matters. Requires high-VRAM GPUs like the RTX PRO 6000 96GB.
  • qwen3.5:122b — 122B Mixture-of-Experts architecture with 256 experts and 10B active parameters. Same 256K context and thinking capabilities, engineered for advanced reasoning where GPU memory permits.
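The size figures above follow directly from bytes per weight: Q8_0 stores roughly one byte per parameter, BF16 two. A back-of-the-envelope check (runtime overhead from embeddings, KV cache, and buffers is not modeled, which is why the listed Q8_0 footprint is ~29GB rather than 27GB):

```python
def approx_weight_size_gb(params_billion: float, bytes_per_param: float) -> float:
    """Rough weight-storage size in GB: parameter count times bytes per weight."""
    return params_billion * bytes_per_param

q8_size = approx_weight_size_gb(27, 1.0)    # ~27 GB of weights for the Q8_0 build
bf16_size = approx_weight_size_gb(27, 2.0)  # ~54 GB, matching the BF16 listing
```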

All variants ship with tuned sampling parameters (temperature 0.6, top_p 0.95, top_k 20) and support up to 32K output tokens. The models are built for multi-step tool calling, autonomous browser operation, and sophisticated coding tasks.
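If you drive the model through Ollama's native REST API rather than a launcher, the same sampling parameters can be set per request. A sketch assuming the /api/chat endpoint and Ollama's standard option names (temperature, top_p, top_k, num_predict); the prompt is illustrative:

```python
import json

# Options mirror the model card's defaults; num_predict caps output
# tokens at 32K, per the listing above.
request = {
    "model": "stewartpark/qwen3.5",
    "messages": [{"role": "user", "content": "Plan the refactor step by step."}],
    "options": {
        "temperature": 0.6,
        "top_p": 0.95,
        "top_k": 20,
        "num_predict": 32768,
    },
    "stream": False,
}

req_body = json.dumps(request)
# POST `req_body` to http://localhost:11434/api/chat (requires a running server).
```

Per-request options override the model's baked-in defaults, so this is also the place to experiment with lower temperatures for deterministic coding runs.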