475 Downloads Updated 2 months ago
88cc37b8585d · 18GB
```shell
# Pull the quantized model
ollama pull anarko/qwen3-coder-flash:30b

# Run the model interactively
ollama run anarko/qwen3-coder-flash:30b
```
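For scripted, non-interactive use, a prompt can also be passed directly on the command line; a minimal sketch, assuming the model has already been pulled and the local Ollama server is running (the prompt text is just an example):

```shell
# One-off prompt without entering the interactive REPL
ollama run anarko/qwen3-coder-flash:30b "Write a Python function that checks whether a string is a palindrome."
```

The response is printed to stdout, which makes this form convenient for piping into other tools.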
Source Model: Qwen3-Coder-30B-A3B-Instruct (Hugging Face)
Quantized Version: UD-Q4_K_XL
```
>>> /show parameters
```

Output:

```
Model defined parameters:
min_p             0
num_ctx           145408
repeat_penalty    1.05
stop              "<|im_start|>"
stop              "<|im_end|>"
temperature       0.7
top_k             20
top_p             0.8
```
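These defaults can be overridden for the current session from inside the Ollama REPL with `/set parameter`; a sketch (the values shown are illustrative, not recommendations):

```
>>> /set parameter temperature 0.2
>>> /set parameter num_ctx 32768
```

Session-level overrides are discarded when the REPL exits; to make them permanent, bake them into a new model via a Modelfile.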
| Context Length (Tokens) | Approx. VRAM Usage |
|---|---|
| 142k (default) | 32 GB |
| 256k | 44 GB |
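To run at the larger context length from the table, one option is to derive a new model with a Modelfile; a minimal sketch, assuming the 256k figure above (256 × 1024 = 262144 tokens) and sufficient VRAM:

```
# Modelfile — extend the base model with a larger context window
FROM anarko/qwen3-coder-flash:30b
PARAMETER num_ctx 262144
```

Build and run it with `ollama create qwen3-coder-flash-256k -f Modelfile` followed by `ollama run qwen3-coder-flash-256k` (the `qwen3-coder-flash-256k` name is arbitrary).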
This Ollama model is based on the Hugging Face repository
unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF. No changes have been made to it other than increasing the default context length.