stewartpark/qwen3.5:27b-q8_0

1,368 2 months ago

Qwen3.5 model (27B dense) with always-on thinking, optimized for agentic coding, tool use, and browser automation.

vision tools thinking 27b
ollama run stewartpark/qwen3.5:27b-q8_0

Details

6960f7794400 · 30GB · qwen35 · 27.8B · Q8_0
Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ TERMS AND CONDITIONS FOR US
{{- if .Messages }} {{- if or .System .Tools }}<|im_start|>system {{- if .System }} {{ .System }} {{
You are a helpful AI assistant. Core behaviors: - Think step-by-step inside <think> tags before resp
{ "min_p": 0, "num_ctx": 262144, "num_predict": 32768, "stop": [ "<|im_end|>

Readme

This is a customized Qwen3.5 collection designed for agentic workflows, available in the following configurations:

  • qwen3.5:27b-bf16 (default) — 27B dense parameters in full BF16 precision (~56GB). Native 256K context window with always-enabled thinking mode. Maximum quality with no quantization loss. Requires 56GB+ VRAM (RTX A6000 + CPU offloading, A100/H100).
  • qwen3.5:27b-q8_0 — 27B dense parameters quantized to Q8_0 (~30GB). Native 256K context window with always-enabled thinking mode. Near-lossless quality at half the VRAM of BF16, accessible on 24GB GPUs (RTX 3090 / 4090) with minimal CPU offloading.
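
The size figures above can be roughly reproduced from the 27.8B parameter count listed under Details. The following is a back-of-the-envelope sketch, not a measurement: it assumes BF16 stores 16 bits per weight and that Q8_0 stores blocks of 32 int8 weights plus one fp16 scale (34 bytes per 32 weights, about 8.5 bits per weight). The helper names `weight_gb` and `min_offload_gb` are hypothetical, and the estimate ignores the KV cache and any tensors kept at a different precision.

```python
# Rough weight-size estimate; real GGUF files differ slightly because some
# tensors may be stored at other precisions.
PARAMS = 27.8e9  # parameter count shown in the Details panel

BITS_PER_WEIGHT = {
    "bf16": 16.0,
    # Assumption: Q8_0 packs 32 int8 weights plus one fp16 scale per block,
    # i.e. 34 bytes per 32 weights = 8.5 bits per weight.
    "q8_0": 34 * 8 / 32,
}


def weight_gb(params: float, fmt: str) -> float:
    """Approximate weight storage in decimal gigabytes."""
    return params * BITS_PER_WEIGHT[fmt] / 8 / 1e9


def min_offload_gb(params: float, fmt: str, vram_gb: float) -> float:
    """Lower bound on weights that cannot fit in VRAM (KV cache ignored)."""
    return max(0.0, weight_gb(params, fmt) - vram_gb)


if __name__ == "__main__":
    for fmt in BITS_PER_WEIGHT:
        print(f"{fmt}: ~{weight_gb(PARAMS, fmt):.1f} GB of weights")
    print(f"q8_0 on a 24 GB GPU: at least "
          f"{min_offload_gb(PARAMS, 'q8_0', 24):.1f} GB offloaded")
```

Under these assumptions the weights come to about 55.6 GB for BF16 and about 29.5 GB for Q8_0, so a 24GB card has to place at least roughly 5.5 GB of Q8_0 weights outside VRAM before any context is allocated.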

All variants ship with sampling parameters tuned for precise output (temperature 0.6, top_p 0.95, top_k 20) and support up to 32K output tokens (num_predict 32768). The models are intended for multi-step tool calling, autonomous browser operation, and complex coding tasks.
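
As an illustration of those sampling settings, here is a minimal sketch of a chat request against Ollama's REST API. It assumes a local Ollama server on the default port (`http://localhost:11434`) and that the model has already been pulled. The function names `build_chat_request` and `chat` are hypothetical. The model's bundled parameters likely make the explicit `options` redundant, so passing them is mainly useful when overriding a value such as `num_predict` for a single call.

```python
import json
import urllib.request

MODEL = "stewartpark/qwen3.5:27b-q8_0"
# Assumption: default local Ollama endpoint.
OLLAMA_URL = "http://localhost:11434/api/chat"


def build_chat_request(prompt: str, num_predict: int = 32768) -> dict:
    """Build an /api/chat payload using the sampling values from the README."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for a single JSON response instead of chunks
        "options": {
            "temperature": 0.6,
            "top_p": 0.95,
            "top_k": 20,
            "num_predict": num_predict,  # cap on generated tokens
        },
    }


def chat(prompt: str, **kwargs) -> str:
    """Send the request and return the assistant's reply text."""
    body = json.dumps(build_chat_request(prompt, **kwargs)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]


if __name__ == "__main__":
    # Requires a running Ollama server with the model available.
    print(chat("Write a function that reverses a string.", num_predict=1024))
```

Keeping payload construction separate from the network call makes the settings easy to inspect or adjust without contacting the server.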