210 Downloads · Updated 2 weeks ago
e0390994c801 · 18GB
```shell
# Pull the quantized model
ollama pull anarko/qwen3-coder-flash:30b

# Run the model
ollama run anarko/qwen3-coder-flash:30b
```
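Besides the interactive CLI, a running Ollama server exposes a local REST API. A minimal sketch of a one-shot generation request using only the Python standard library (assumes Ollama is listening on its default address, `http://localhost:11434`; the prompt is illustrative):

```python
import json
import urllib.request

# Request body for Ollama's /api/generate endpoint.
# "stream": False asks for a single JSON response instead of a token stream.
payload = {
    "model": "anarko/qwen3-coder-flash:30b",
    "prompt": "Write a Python function that reverses a string.",
    "stream": False,
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment to send the request against a running server:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```

Sampling options (temperature, top_p, etc.) can also be passed per request via an `"options"` object in the same payload.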
Source Model: Qwen3-Coder-30B-A3B-Instruct (Hugging Face)
Quantized Version: UD-Q4_K_XL
```
>>> /show parameters
Model defined parameters:
min_p            0
num_ctx          145408
repeat_penalty   1.05
stop             "<|im_start|>"
stop             "<|im_end|>"
temperature      0.7
top_k            20
top_p            0.8
```
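These defaults can be overridden per session from the interactive prompt with `/set parameter`; a minimal sketch (the values below are illustrative, not recommendations):

```
>>> /set parameter temperature 0.2
>>> /set parameter num_ctx 32768
```

Changes made this way last only for the current session; for a persistent change, bake the parameter into a derived model with a Modelfile.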
| Context Length (Tokens) | Approx. VRAM Usage (GB) |
|---|---|
| 142k (default) | 32 |
| 256k | 44 |
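To run at the larger 256k context from the table (assuming roughly 44 GB of VRAM is available), one option is to derive a variant with a Modelfile; a minimal sketch (the derived model name is illustrative):

```
# Modelfile — derive a 256k-context variant of the quantized model
FROM anarko/qwen3-coder-flash:30b
PARAMETER num_ctx 262144
```

Then build and run it:

```shell
ollama create qwen3-coder-flash-256k -f Modelfile
ollama run qwen3-coder-flash-256k
```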
This Ollama model is based on the Hugging Face repository
unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF
. No changes have been made to it other than increasing the context length.