second_constantine/qwen3-A3B:30b


39 Downloads Updated 1 month ago

This is a continuation of the Qwen3 thinking model (MoE), with improved reasoning quality and depth. It is quantized as UD-Q4_K_XL, and thinking is always on (it cannot be switched off).

tools thinking 30b


c140a12a8cca · 18GB

arch qwen3moe

parameters 30.5B

quantization Q4_K_M

18GB

{ "min_p": 0, "presence_penalty": 1, "stop": [ "<|im_start|>", "<|im_end

181B

{{- $lastUserIdx := -1 -}} {{- range $idx, $msg := .Messages -}} {{- if eq $msg.Role "user" }}{{ $la

1.5kB
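
As a rough illustration of how the parameters above might be passed explicitly, here is a minimal sketch that builds a non-streaming request for Ollama's `/api/chat` endpoint. It assumes a local Ollama server on the default address `http://localhost:11434`; the helper names `build_chat_request` and `chat` are hypothetical. Only the values visible on this page (`min_p` 0, `presence_penalty` 1) are mirrored; the truncated `stop` list is left to the model's own defaults.

```python
import json
import urllib.request

MODEL = "second_constantine/qwen3-A3B:30b"
# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/chat"


def build_chat_request(prompt, num_ctx=4096):
    """Build the JSON body for a single-turn, non-streaming chat request."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
        "options": {
            "num_ctx": num_ctx,      # context window size in tokens
            "min_p": 0,              # mirrors the model's listed parameter
            "presence_penalty": 1,   # mirrors the model's listed parameter
        },
    }


def chat(prompt, num_ctx=4096):
    """Send the request to the local server and return the reply text."""
    payload = json.dumps(build_chat_request(prompt, num_ctx)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.loads(resp.read())
    return body["message"]["content"]


if __name__ == "__main__":
    print(chat("Explain mixture-of-experts in two sentences.", num_ctx=15360))
```

Since the model already ships with these parameters, passing them again is optional; the sketch mainly shows where `num_ctx` would be set to choose a context size.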

Device	Speed (tokens/s)	Context (tokens)	VRAM (GB)	Versions
RTX 3090 24gb	~98	4096	18	UD-Q4_K_XL, 0.12.2
RTX 3090 24gb	~97	15360	20	UD-Q4_K_XL, 0.12.2
RTX 3090 24gb	~87	4096	17	IQ4_XS, 0.12.3
RTX 3090 24gb	~84	15360	18	IQ4_XS, 0.12.3
M1 Max 32gb	~49	4096	18	UD-Q4_K_XL, 0.12.2
M1 Max 32gb	~46	15360	18	UD-Q4_K_XL, 0.12.2
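
To make the effect of context size easier to read off the benchmark figures, the sketch below computes the relative speed drop between the 4096 and 15360 contexts for each device and quantization. The rows are copied from the table (approximate speeds taken at face value); the function name `speed_drop_percent` is hypothetical.

```python
from collections import defaultdict

# (device, quantization, context, tokens/s, VRAM in GB), copied from the table.
ROWS = [
    ("RTX 3090 24gb", "UD-Q4_K_XL", 4096, 98, 18),
    ("RTX 3090 24gb", "UD-Q4_K_XL", 15360, 97, 20),
    ("RTX 3090 24gb", "IQ4_XS", 4096, 87, 17),
    ("RTX 3090 24gb", "IQ4_XS", 15360, 84, 18),
    ("M1 Max 32gb", "UD-Q4_K_XL", 4096, 49, 18),
    ("M1 Max 32gb", "UD-Q4_K_XL", 15360, 46, 18),
]


def speed_drop_percent(rows, short_ctx=4096, long_ctx=15360):
    """Return {(device, quant): percent speed lost going from short_ctx to long_ctx}."""
    speeds = defaultdict(dict)
    for device, quant, ctx, tok_s, _vram in rows:
        speeds[(device, quant)][ctx] = tok_s
    return {
        key: 100.0 * (by_ctx[short_ctx] - by_ctx[long_ctx]) / by_ctx[short_ctx]
        for key, by_ctx in speeds.items()
        if short_ctx in by_ctx and long_ctx in by_ctx
    }


if __name__ == "__main__":
    for (device, quant), drop in speed_drop_percent(ROWS).items():
        print(f"{device} {quant}: {drop:.1f}% slower at 15360 context")
```

Because the speeds in the table are approximate, the resulting percentages should be treated as rough indications only.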