
Unsloth Dynamic 2.0 quants with a 1M-token context, superior accuracy, and SOTA quantization performance. Select UD-IQ3_XXS for 16GB VRAM, UD-Q4_K_XL for 24GB VRAM, or UD-Q5_K_XL/UD-Q6_K_XL for 32GB VRAM.

tools


56720d55401e · 18GB

qwen3moe · 30.5B · Q4_K_M
{ "min_p": 0, "repeat_penalty": 1.05, "stop": [ "<|im_start|>", "<|im_en
Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ TERMS AND CONDITIONS FOR US
{{- $lastUserIdx := -1 -}} {{- range $idx, $msg := .Messages -}} {{- if eq $msg.Role "user" }}{{ $la

Readme

Qwen3 Coder 30B for consumer-grade graphics cards, quantized with Unsloth Dynamic 2.0 and supporting a 1M-token context. For graphics cards with 16GB of VRAM, UD-IQ3_XXS is recommended. For graphics cards with 24GB of VRAM, UD-Q4_K_XL is recommended. For graphics cards with 32GB of VRAM, UD-Q5_K_XL is recommended. Remember to set the environment variable OLLAMA_CONTEXT_LENGTH to adjust the context length. With 16GB of VRAM and UD-IQ3_XXS, the recommended context length is 8192.
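
The recommendations above can be sketched as a small lookup. This is only an illustration: the helper names `recommend_quant` and `ollama_env` are hypothetical, the tiers are copied from this Readme, and falling back to the next lower tier for in-between VRAM sizes (for example 20GB) is an assumption rather than something the Readme states.

```python
import os
from typing import Optional

# Tiers copied from the Readme: (minimum VRAM in GB, recommended tag).
# Ordered from largest to smallest so the first match is the best fit.
QUANT_TIERS = [
    (32, "UD-Q5_K_XL"),
    (24, "UD-Q4_K_XL"),
    (16, "UD-IQ3_XXS"),
]


def recommend_quant(vram_gb: float) -> Optional[str]:
    """Return the recommended tag for the given VRAM, or None below 16GB.

    Assumption: a card between two tiers uses the lower tier's tag.
    The Readme gives no recommendation below 16GB, so None is returned.
    """
    for min_vram_gb, tag in QUANT_TIERS:
        if vram_gb >= min_vram_gb:
            return tag
    return None


def ollama_env(context_length: int) -> dict:
    """Return a copy of the current environment with OLLAMA_CONTEXT_LENGTH set."""
    env = dict(os.environ)
    env["OLLAMA_CONTEXT_LENGTH"] = str(context_length)
    return env


if __name__ == "__main__":
    # Example: a 16GB card with the context length suggested in the Readme.
    print(recommend_quant(16))
    print(ollama_env(8192)["OLLAMA_CONTEXT_LENGTH"])
```

The dictionary returned by `ollama_env` could be passed as the `env` argument of `subprocess.run` when starting the Ollama server, since the variable needs to be visible to the server process for it to take effect.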