huihui_ai/Qwen3-Coder:latest

611 Downloads Updated 4 months ago

This is not the ablation version. Qwen3-Coder features the following key enhancements: significant performance, long-context capabilities, and agentic coding.

tools thinking 480b

e7885187ca40 · 175GB · arch: qwen3moe · parameters: 480B · quantization: Q2_K

Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ TERMS AND CONDITIONS FOR US

11kB

{ "num_gpu": 1, "repeat_penalty": 1, "stop": [ "<|im_start|>", "<|im_end

132B

{{- $lastUserIdx := -1 -}} {{- range $idx, $msg := .Messages -}} {{- if eq $msg.Role "user" }}{{ $la

1.7kB

### Template
The template used is Qwen3's template, which may not be suitable for tool calls.

#### Parameter description
**1. num_gpu**  
The model's default `num_gpu` is 1, so only one layer is loaded onto the GPU by default and all remaining layers are loaded into CPU memory. Adjust `num_gpu` to match your GPU configuration.

```
/set parameter num_gpu 2
```
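As a rough way to pick a starting value, the sketch below divides the model size evenly across layers and counts how many fit in the available VRAM. It rests on several assumptions that are not from this page: the function name `estimate_num_gpu` is hypothetical, the layer count must be supplied by you (it is not stated here), layers are treated as equal in size, and a fixed reserve is held back for the context cache and other buffers. Treat the result as a first guess to refine by trial.

```python
def estimate_num_gpu(model_gb, n_layers, vram_gb, reserve_gb=2.0):
    """Rough guess at how many layers fit on the GPU.

    Assumes every layer takes model_gb / n_layers (a simplification)
    and keeps reserve_gb of VRAM free for cache and buffers.
    """
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    per_layer_gb = model_gb / n_layers
    # Never suggest more layers than the model has.
    return min(n_layers, int(usable_gb // per_layer_gb))


# 175 GB is the model size listed on this page; the layer count (50)
# and VRAM size (24 GB) are placeholder values for illustration only.
layers = estimate_num_gpu(model_gb=175, n_layers=50, vram_gb=24)
print(f"/set parameter num_gpu {layers}")
```

If the model fails to load or runs out of GPU memory with the suggested value, lower `num_gpu` and try again.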
**2. num_thread**  
`num_thread` sets the number of CPU threads used for inference. It is recommended to set it to about half the number of cores in your computer; otherwise, the CPU will run at 100%.
```
/set parameter num_thread 32
```
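The "half the cores" rule can be computed rather than guessed. The following is a minimal sketch; the helper name `suggest_num_thread` is hypothetical, and note that Python's `os.cpu_count()` reports logical CPUs, which on machines with hyper-threading may be double the physical core count.

```python
import os


def suggest_num_thread(cpu_count=None):
    """Return roughly half the available CPUs, but never less than 1."""
    if cpu_count is None:
        # os.cpu_count() can return None on some platforms; fall back to 1.
        cpu_count = os.cpu_count() or 1
    return max(1, cpu_count // 2)


# Print the command to paste into the Ollama prompt for this machine.
print(f"/set parameter num_thread {suggest_num_thread()}")
```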
**3. num_ctx**  
In Ollama, `num_ctx` sets the size of the context window, that is, the number of tokens the model can keep in context during inference.
```
/set parameter num_ctx 4096
```
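The same three parameters can also be passed per request through the `options` field of Ollama's REST API instead of `/set parameter`. The sketch below only builds the JSON body and does not send it; the helper name `build_generate_request` and the prompt text are illustrative assumptions, and the values simply mirror the examples above.

```python
import json


def build_generate_request(prompt, num_gpu=1, num_thread=32, num_ctx=4096):
    """Build a request body for Ollama's /api/generate endpoint."""
    return {
        "model": "huihui_ai/Qwen3-Coder",
        "prompt": prompt,
        "stream": False,
        # Per-request overrides of the model's default parameters.
        "options": {
            "num_gpu": num_gpu,
            "num_thread": num_thread,
            "num_ctx": num_ctx,
        },
    }


body = build_generate_request("Write a hello world program in Go.", num_gpu=2)
print(json.dumps(body, indent=2))
```

The printed JSON can be POSTed to a running Ollama server, which by default listens at `http://localhost:11434/api/generate`.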

### References

[Qwen/qwen3-coder](https://huggingface.co/collections/Qwen/qwen3-coder-687fc861e53c939e52d52d10)

### Donation

You can follow [x.com/support_huihui](https://x.com/support_huihui) to get the latest model information from huihui.ai.

##### Your donation helps us continue development and improvement; even a cup of coffee makes a difference.
- bitcoin:
```
  bc1qqnkhuchxw0zqjh2ku3lu4hq45hc6gy84uk70ge
```