huihui_ai/kimi-k2-abliterated:1026b-instruct-0905-Q2_K

A state-of-the-art mixture-of-experts (MoE) language model. Kimi K2-Instruct-0905 demonstrates significant improvements in performance on public benchmarks and real-world coding agent tasks.

Architecture: deepseek2 · Parameters: 1.03T · Quantization: Q2_K · Digest: 9688551c25a1 · Size: 373GB
License: Modified MIT License, Copyright (c) 2025 Moonshot AI

Readme

This is an uncensored version of unsloth/Kimi-K2-Instruct-0905-BF16 created with abliteration.

Parameter descriptions

1. num_gpu
The value of num_gpu inside the model defaults to 1, meaning only one layer is offloaded to the GPU; all remaining layers are loaded into CPU memory. You can raise num_gpu to match your GPU's available VRAM.

/set parameter num_gpu 2

2. num_thread
“num_thread” sets the number of CPU threads used for inference. It is recommended to use about half the number of cores in your computer; otherwise, the CPU will be pinned at 100%.

/set parameter num_thread 32
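The rule of thumb above (half your cores) can be sketched as a small helper; `recommended_num_thread` is a hypothetical name, not part of ollama, and `os.cpu_count()` reports logical cores, which may differ from physical cores:

```python
import os

def recommended_num_thread(logical_cores=None):
    # Follow the README's rule of thumb: use about half the available
    # cores so the CPU is not pinned at 100%, with a floor of 1 thread.
    cores = logical_cores if logical_cores is not None else os.cpu_count() or 1
    return max(1, cores // 2)

print(recommended_num_thread(64))  # a 64-core machine -> 32
```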

3. num_ctx
“num_ctx” for ollama sets the size of the context window, i.e. how many tokens of context the model can maintain during inference.

/set parameter num_ctx 40960
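Instead of issuing /set commands each session, the three parameters above can be baked into a derived model with an Ollama Modelfile, so they apply on every run. A minimal sketch; the model name kimi-k2-local is hypothetical, and the values shown are examples to adjust for your hardware:

```
# Modelfile: derive a variant with persistent parameter overrides
FROM huihui_ai/kimi-k2-abliterated:1026b-instruct-0905-Q2_K

# Layers offloaded to the GPU (raise to match your VRAM)
PARAMETER num_gpu 2
# CPU threads, roughly half your cores
PARAMETER num_thread 32
# Context window size in tokens
PARAMETER num_ctx 40960
```

Build and run the variant with `ollama create kimi-k2-local -f Modelfile` followed by `ollama run kimi-k2-local`.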

References

HuggingFace

Donation

You can follow x.com/support_huihui to get the latest model information from huihui.ai.

Your donation helps us continue our development and improvement; even the price of a cup of coffee makes a difference.
  • bitcoin:
  bc1qqnkhuchxw0zqjh2ku3lu4hq45hc6gy84uk70ge