This is a customized Qwen3.5 collection designed for agentic workflows, available in three configurations:
- qwen3.5:27b (default) — 27B dense parameters quantized to Q8_0 (~29GB). Native 256K context window with always-enabled thinking mode. Near-lossless quality at half the VRAM of BF16, accessible on 48GB GPUs like the RTX A6000 or dual-24GB setups.
- qwen3.5:27b-bf16 — Same 27B dense model in full BF16 precision (~54GB). Better for vision and browser automation workloads where full precision matters. Requires high-VRAM GPUs like the RTX PRO 6000 96GB.
- qwen3.5:122b — 122B Mixture-of-Experts architecture with 256 experts and 10B active parameters. Same 256K context and thinking capabilities, engineered for advanced reasoning where GPU memory permits.
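Any of the tags above can also serve as a base for a derived model via an Ollama Modelfile. A minimal sketch, assuming the `qwen3.5:27b` tag from this list; the model name and system prompt are illustrative, not part of this collection:

```
FROM qwen3.5:27b

# Pin the sampling parameters recommended on this card.
PARAMETER temperature 0.6
PARAMETER top_p 0.95
PARAMETER top_k 20

# Illustrative system prompt for agentic use; replace with your own.
SYSTEM You are a coding agent. Plan before acting and use tools when available.
```

Build and run the derived model with `ollama create my-agent -f Modelfile`, then `ollama run my-agent`.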
All variants ship with recommended sampling parameters (temperature 0.6, top_p 0.95, top_k 20) and support up to 32K output tokens. The models handle multi-step tool calling, autonomous browser operation, and complex coding tasks.
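As a concrete sketch of those sampling parameters in use, the snippet below builds a request for Ollama's standard `/api/chat` endpoint. The model tag and prompt are assumptions taken from this card; only the payload construction runs without a live server, so the network call is left commented out.

```python
import json

# Request payload for Ollama's /api/chat endpoint, using the sampling
# parameters recommended on this card. Model tag and prompt are examples.
payload = {
    "model": "qwen3.5:27b",
    "messages": [
        {"role": "user", "content": "Summarize this repository's build steps."}
    ],
    "options": {
        "temperature": 0.6,
        "top_p": 0.95,
        "top_k": 20,
        "num_predict": 32768,  # up to 32K output tokens
    },
    "stream": False,
}

# To actually send it (requires a running Ollama server on localhost:11434):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:11434/api/chat",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode())

print(json.dumps(payload["options"]))
```

The same `options` dict works unchanged with the `qwen3.5:27b-bf16` and `qwen3.5:122b` tags; only the `model` field differs.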