
This is a continuation of the Qwen3 thinking model (MoE), with improved quality and depth of reasoning. (Quantized as UD-Q4_K_XL; thinking mode cannot be switched off.)

Run the model:

```shell
ollama run second_constantine/qwen3-A3B:30b
```
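Beyond the interactive CLI, a running Ollama server also exposes the model over its local REST API (`/api/chat` on port 11434 by default). The sketch below is a minimal illustration, not part of this model card: the helper names are mine, and the `think` flag is redundant for this model (thinking is always on) but makes the intent explicit on Ollama versions that support it.

```python
import json
import urllib.request

# Default local Ollama endpoint and the model tag from this page.
OLLAMA_URL = "http://localhost:11434/api/chat"
MODEL = "second_constantine/qwen3-A3B:30b"


def build_chat_request(prompt: str, model: str = MODEL) -> dict:
    """Build a JSON payload for Ollama's /api/chat endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "think": True,   # explicit; this model cannot switch thinking off anyway
        "stream": False,
    }


def post_chat(payload: dict) -> dict:
    """POST the payload to a running Ollama server and return the parsed reply."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


# Example payload (sending it requires a running Ollama server):
print(json.dumps(build_chat_request("Why is the sky blue?"), indent=2))
```

In the reply, the visible answer is in `message.content`; recent Ollama versions return the reasoning separately rather than inline.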

Applications

- Claude Code: `ollama launch claude --model second_constantine/qwen3-A3B:30b`
- Codex: `ollama launch codex --model second_constantine/qwen3-A3B:30b`
- OpenCode: `ollama launch opencode --model second_constantine/qwen3-A3B:30b`
- OpenClaw: `ollama launch openclaw --model second_constantine/qwen3-A3B:30b`



| Feature  | Value                        |
|----------|------------------------------|
| vision   | false                        |
| thinking | true (cannot be switched off)|
| tools    | true                         |
| Device         | Speed, tokens/s | Context | VRAM, GB | Quant, Ollama version |
|----------------|-----------------|---------|----------|-----------------------|
| RTX 3090 24 GB | ~98             | 4096    | 18       | UD-Q4_K_XL, 0.12.2    |
| RTX 3090 24 GB | ~97             | 15360   | 20       | UD-Q4_K_XL, 0.12.2    |
| RTX 3090 24 GB | ~87             | 4096    | 17       | IQ4_XS, 0.12.3        |
| RTX 3090 24 GB | ~84             | 15360   | 18       | IQ4_XS, 0.12.3        |
| M1 Max 32 GB   | ~49             | 4096    | 18       | UD-Q4_K_XL, 0.12.2    |
| M1 Max 32 GB   | ~46             | 15360   | 18       | UD-Q4_K_XL, 0.12.2    |
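The throughput numbers above translate directly into wall-clock latency, since generation time scales linearly with output length; note that with thinking always on, the reasoning tokens are generated before the visible answer and count against the same budget. A back-of-the-envelope helper (the function name is illustrative):

```python
def generation_time_s(tokens: int, tokens_per_s: float) -> float:
    """Estimated wall-clock seconds to generate `tokens` at a steady decode rate."""
    return tokens / tokens_per_s


# A 1000-token reply (thinking + answer) on an RTX 3090 at ~98 tok/s:
print(round(generation_time_s(1000, 98.0), 1))  # ~10.2 s
```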