huihui_ai/deepseek-r1-pruned:411b-coder-q4_K_M

75 Downloads Updated 5 months ago

DeepSeek-R1-Pruned-Coder-411B is a pruned version of DeepSeek-R1, reduced from 256 experts to 160 experts. The pruned model is mainly intended for code generation.


014981c453d8 · 257GB

model · 257GB
  arch: deepseek2
  parameters: 426B
  quantization: Q4_K_M

template · 387B
  {{- if .System }}{{ .System }}{{ end }} {{- range $i, $_ := .Messages }} {{- $last := eq (len (slice

license · 1.1kB

params · 160B
  { "num_gpu": 1, "stop": [ "<｜begin▁of▁sentence｜>", "<｜end▁of▁s

Readme

Parameter description

1. num_gpu
The model ships with num_gpu set to 1, so by default only one layer is loaded onto the GPU and all remaining layers are loaded into CPU memory. Raise num_gpu to offload more layers if your GPU configuration has room for them.

/set parameter num_gpu 2

2. num_thread
“num_thread” sets the number of CPU threads used during inference. It is recommended to use half the number of cores in your computer; otherwise, the CPU will run at 100%.

/set parameter num_thread 32

3. num_ctx
“num_ctx” in ollama sets the size of the context window, that is, the maximum number of tokens (prompt plus generated output) the model can keep in context during inference.

/set parameter num_ctx 4096
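
The same three parameters can also be passed per request through the "options" field of Ollama's REST API instead of being set interactively. The following is a minimal sketch, not part of the original README. It assumes an Ollama server on the default local address http://localhost:11434, and the helper names (recommended_threads, build_request, generate) are hypothetical, introduced only for illustration.

```python
import json
import os
import urllib.request

# Model tag taken from this page.
MODEL = "huihui_ai/deepseek-r1-pruned:411b-coder-q4_K_M"
# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def recommended_threads(core_count=None):
    """Return half of the available cores (at least 1), following the advice above."""
    if core_count is None:
        core_count = os.cpu_count() or 1
    return max(1, core_count // 2)


def build_request(prompt, num_gpu=1, num_thread=None, num_ctx=4096):
    """Build the JSON body for a non-streaming generate request."""
    if num_thread is None:
        num_thread = recommended_threads()
    return {
        "model": MODEL,
        "prompt": prompt,
        "stream": False,
        "options": {
            "num_gpu": num_gpu,        # layers offloaded to the GPU
            "num_thread": num_thread,  # CPU threads used for inference
            "num_ctx": num_ctx,        # context window size in tokens
        },
    }


def generate(prompt, **options):
    """Send the request to a running Ollama server and return the reply text."""
    body = json.dumps(build_request(prompt, **options)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    # Only prints the payload; call generate(...) once a server is running.
    payload = build_request("Write a quicksort in Python.", num_gpu=2, num_thread=32)
    print(json.dumps(payload, indent=2))
```

Building the payload separately from sending it makes it easy to inspect the options before loading a 257GB model. The defaults mirror the values shown above (num_gpu 1, num_ctx 4096), and num_thread falls back to half the detected core count when it is not given.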

Donation

You can follow x.com/support_huihui to get the latest model information from huihui.ai.

Your donation helps us continue development and improvement. Even the price of a cup of coffee helps.

bitcoin:

  bc1qqnkhuchxw0zqjh2ku3lu4hq45hc6gy84uk70ge