227 5 days ago

A lightweight variant of Qwen3.6-35B-A3B using Q4_K_M quantization. Designed to fit within 24 GB total VRAM with a 16K context window.

tools thinking
a6b253d76a2f · 180B
{
  "min_p": 0,
  "num_ctx": 16384,
  "num_gpu": 99,
  "presence_penalty": 1.5,
  "repeat_penalty": 1,
  "stop": [
    "<|im_start|>",
    "<|im_end|>"
  ],
  "temperature": 1,
  "top_k": 20,
  "top_p": 0.95
}
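
The parameters above can also be passed per request rather than baked into the model. The following is a minimal sketch, assuming the model is served through Ollama's HTTP API on its default local address; the model tag `my-qwen-q4km` used in the note below and the helper names `build_chat_request` and `send` are hypothetical and not part of this page.

```python
import json
import urllib.request

# Sampling and runtime options copied from the parameter block on this page.
OPTIONS = {
    "min_p": 0,
    "num_ctx": 16384,          # 16K-token context window
    "num_gpu": 99,             # request that all layers be offloaded to the GPU
    "presence_penalty": 1.5,
    "repeat_penalty": 1,
    "stop": ["<|im_start|>", "<|im_end|>"],
    "temperature": 1,
    "top_k": 20,
    "top_p": 0.95,
}


def build_chat_request(model, prompt, options=OPTIONS):
    """Return the JSON body for a non-streaming POST to /api/chat."""
    return {
        "model": model,  # caller supplies the tag; none is given on this page
        "messages": [{"role": "user", "content": prompt}],
        "options": dict(options),  # copy so callers can tweak it safely
        "stream": False,
    }


def send(body, host="http://localhost:11434"):
    """POST the body to a running server and return the decoded JSON reply.

    Assumes a server is listening on `host`; this is not checked here.
    """
    req = urllib.request.Request(
        f"{host}/api/chat",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With a server running, `send(build_chat_request("my-qwen-q4km", "Hello"))["message"]["content"]` would return the reply text. Keeping the options in one dictionary makes it easy to override a single value, such as a smaller `num_ctx`, without editing the model itself.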