89 1 month ago

Qwen3-Thinking-2507 is the continuation of the Qwen3 thinking model, with improved reasoning quality and depth. Qwen3-Instruct-2507 is the updated version of the previous Qwen3 non-thinking mode. (Quantized UD-Q4_K_XL; thinking and instruct versions.)

tools thinking 30b

c140a12a8cca · 18GB

qwen3moe · 30.5B · Q4_K_M
{ "min_p": 0, "presence_penalty": 1, "stop": [ "<|im_start|>", "<|im_end
{{- $lastUserIdx := -1 -}} {{- range $idx, $msg := .Messages -}} {{- if eq $msg.Role "user" }}{{ $la
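
The parameter line above lists sampling defaults that ship with the model. As a minimal sketch, the same values could be passed explicitly in the `options` field of a chat request body, for example to override them per request. The model tag `qwen3-2507:thinking` is a hypothetical placeholder, and the stop list is truncated in the source, so only its first visible entry is reproduced here.

```python
import json

# Hypothetical model tag; substitute the tag you actually pulled.
MODEL_TAG = "qwen3-2507:thinking"


def build_chat_request(prompt, model=MODEL_TAG):
    """Build a JSON-serializable body for a POST to Ollama's /api/chat."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
        # Values copied from the visible part of the parameter line.
        # The stop list is cut off in the source, so only its first
        # entry appears here.
        "options": {
            "min_p": 0,
            "presence_penalty": 1,
            "stop": ["<|im_start|>"],
        },
    }


if __name__ == "__main__":
    # Print the body; sending it to a running server is left to the caller.
    print(json.dumps(build_chat_request("Why is the sky blue?"), indent=2))
```

The function only builds the payload, so it runs without a server. Sending it requires a local Ollama instance and an HTTP client of your choice.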

Readme

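| Feature  | Value      |
|----------|------------|
| vision   | false      |
| thinking | by version |
| tools    | true       |

| Device         | Speed         | Version      |
|----------------|---------------|--------------|
| RTX 3090 24 GB | ~105 tokens/s | thinking     |
| M1 Max 32 GB   | ~51 tokens/s  | thinking     |
| RTX 3090 24 GB | ~107 tokens/s | non-thinking |
| M1 Max 32 GB   | ~53 tokens/s  | non-thinking |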
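
The source does not say how the speeds above were measured. One plausible way to reproduce such a figure, sketched here as an assumption, is to divide the `eval_count` field of an Ollama response by its `eval_duration` field, which is reported in nanoseconds. The sample values below are made up for illustration and are not measurements.

```python
def tokens_per_second(eval_count, eval_duration_ns):
    """Return generation throughput from a token count and a duration in ns."""
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration_ns must be positive")
    return eval_count / (eval_duration_ns / 1e9)


if __name__ == "__main__":
    # Illustrative response fragment, not real benchmark data.
    sample = {"eval_count": 525, "eval_duration": 5_000_000_000}
    rate = tokens_per_second(sample["eval_count"], sample["eval_duration"])
    print(f"{rate:.1f} tokens/s")  # prints "105.0 tokens/s"
```

This counts only generated tokens. Prompt processing is reported separately in the response, so it does not affect the result.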