ollama run huihui_ai/Qwen3-Coder
ollama launch claude --model huihui_ai/Qwen3-Coder
ollama launch codex --model huihui_ai/Qwen3-Coder
ollama launch opencode --model huihui_ai/Qwen3-Coder
ollama launch openclaw --model huihui_ai/Qwen3-Coder
This model uses Qwen3's chat template, which may not be suitable for tool calls.
1. num_gpu
Inside the model, num_gpu defaults to 1, meaning only one layer is offloaded to the GPU; all remaining layers are loaded into CPU memory. You can raise num_gpu according to your GPU's available VRAM.
/set parameter num_gpu 2
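Rather than setting this interactively in each session, parameters can be persisted in a derived model with a Modelfile (the variant name qwen3-coder-tuned below is an example, not part of this page):

```
FROM huihui_ai/Qwen3-Coder
PARAMETER num_gpu 2
```

Then build the variant with `ollama create qwen3-coder-tuned -f Modelfile` and run it as usual.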
2. num_thread
"num_thread" sets the number of CPU threads used for inference. A good starting point is half the number of cores in your machine; otherwise the CPU will sit at 100%.
/set parameter num_thread 32
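A small sketch of the "half your cores" guideline above, using Python's standard library to derive a starting value for your machine:

```python
import os

# os.cpu_count() reports logical cores (may be None on some platforms).
logical_cores = os.cpu_count() or 1

# Recommended num_thread: roughly half the cores, but at least 1,
# so inference doesn't pin the CPU at 100%.
recommended = max(1, logical_cores // 2)

print(f"/set parameter num_thread {recommended}")
```

Adjust upward if the machine is otherwise idle, or downward if you run other workloads alongside Ollama.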
3. num_ctx
"num_ctx" in Ollama sets the size of the context window, i.e., the maximum number of tokens the model can keep in context during inference.
/set parameter num_ctx 4096
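The same parameter can be passed per request through Ollama's REST API (POST /api/generate), under the "options" field. A minimal sketch of the request body; the prompt text is just an illustration:

```python
import json

# Request body for Ollama's /api/generate endpoint.
# "num_ctx" under "options" sets the context window in tokens.
body = {
    "model": "huihui_ai/Qwen3-Coder",
    "prompt": "Write a quicksort in Python.",  # example prompt
    "options": {"num_ctx": 4096},
}

payload = json.dumps(body)
print(payload)
```

Send this payload to `http://localhost:11434/api/generate` on a machine where the Ollama server is running.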
You can follow x.com/support_huihui to get the latest model information from huihui.ai.
BTC: bc1qqnkhuchxw0zqjh2ku3lu4hq45hc6gy84uk70ge