Qwen3-Thinking-2507 continues the Qwen3 thinking model, with improved quality and depth of reasoning. Qwen3-Instruct-2507 is the updated version of the previous Qwen3 non-thinking mode. Both the thinking and instruct versions are provided, quantized as UD-Q4_K_XL.

Tags: tools, thinking, 30b

Readme

| Feature | Value |
| --- | --- |
| vision | false |
| thinking | by version |
| tools | true |

| Device | Speed | Version |
| --- | --- | --- |
| RTX 3090 24 GB | ~105 tokens/s | thinking |
| M1 Max 32 GB | ~51 tokens/s | thinking |
| RTX 3090 24 GB | ~107 tokens/s | non-thinking |
| M1 Max 32 GB | ~53 tokens/s | non-thinking |
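
To put the throughput figures in perspective, here is a minimal sketch that turns them into rough generation times. The speeds are copied from the table; the function name `estimate_seconds` and the 2000-token response length are illustrative assumptions, and the estimate ignores model load time and prompt processing.

```python
# Approximate decode speeds from the table above, in tokens per second.
THROUGHPUT = {
    ("RTX 3090", "thinking"): 105,
    ("M1 Max", "thinking"): 51,
    ("RTX 3090", "non-thinking"): 107,
    ("M1 Max", "non-thinking"): 53,
}


def estimate_seconds(device: str, version: str, n_tokens: int) -> float:
    """Rough time to generate n_tokens at the listed speed.

    Ignores model load time and prompt processing, so real runs take longer.
    """
    return n_tokens / THROUGHPUT[(device, version)]


if __name__ == "__main__":
    n_tokens = 2000  # hypothetical response length, chosen for illustration
    for (device, version), speed in THROUGHPUT.items():
        seconds = estimate_seconds(device, version, n_tokens)
        print(f"{device:9} {version:13} ~{speed} tokens/s -> {seconds:.1f} s")
```

Per-token speed is almost identical between the two versions on each device, so any difference in wall-clock time will mostly come from how many tokens each version emits: the thinking version generates reasoning tokens before its final answer.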