second_constantine/deepseek-coder-v2:16b-Q4_K_M

4,053 Downloads Updated 2 months ago

This is a brand new Mixture of Experts (MoE) model from DeepSeek, specialized for coding instructions (quantized IQ4_XS).

tools 16b

ollama run second_constantine/deepseek-coder-v2:16b-Q4_K_M

curl http://localhost:11434/api/chat \
  -d '{
    "model": "second_constantine/deepseek-coder-v2:16b-Q4_K_M",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

from ollama import chat

response = chat(
    model='second_constantine/deepseek-coder-v2:16b-Q4_K_M',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)

import ollama from 'ollama'

const response = await ollama.chat({
  model: 'second_constantine/deepseek-coder-v2:16b-Q4_K_M',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)

Details

feed877ce514 · 10GB

model

arch: deepseek2

parameters: 15.7B

quantization: Q4_K_M

10GB

template

{{- if .Messages }} {{- if or .System .Tools }} {{- if .System }} {{ .System }} {{- end }} {{- if .T

1.3kB

params

{ "stop": [ "User:", "Assistant:" ] }

32B
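The `stop` parameter above ends generation as soon as either string appears in the output. As a minimal sketch, the same values could also be sent explicitly as per-request `options` to the `/api/chat` endpoint from the curl example. The helper names `build_payload` and `send` are illustrative and not part of any library, and the code assumes an Ollama server is listening on the default local port.

```python
import json
import urllib.request

MODEL = "second_constantine/deepseek-coder-v2:16b-Q4_K_M"
URL = "http://localhost:11434/api/chat"  # same endpoint as the curl example


def build_payload(prompt, stop=("User:", "Assistant:")):
    """Build a non-streaming chat request with explicit stop sequences."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for a single JSON object instead of a stream
        "options": {"stop": list(stop)},
    }


def send(payload):
    """POST the payload to a local Ollama server (assumed to be running)."""
    req = urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


if __name__ == "__main__":
    # Only prints the request body; call send(...) to actually query the model.
    print(json.dumps(build_payload("Hello!"), indent=2))
```

Keeping payload construction separate from the network call makes it easy to inspect or log the request before sending it.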

Readme

Based on https://huggingface.co/lmstudio-community/DeepSeek-Coder-V2-Lite-Instruct-GGUF

Feature	Value
vision	false
thinking	false
tools	true
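Since the table lists `tools` as supported, a tool-calling round trip might look roughly like the sketch below. The `count_lines` tool, the `REGISTRY` mapping, and the `run_tool_calls` helper are hypothetical names made up for illustration. The tool schema would be passed as the `tools` field of a chat request, and the `tool_calls` shape is assumed to follow the Ollama chat API response format.

```python
def count_lines(text: str) -> int:
    """Hypothetical tool: count the lines in a piece of text."""
    return len(text.splitlines())


# Tool description to send in the "tools" field of a chat request.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "count_lines",
        "description": "Count the number of lines in a piece of text",
        "parameters": {
            "type": "object",
            "properties": {
                "text": {"type": "string", "description": "Text to measure"},
            },
            "required": ["text"],
        },
    },
}]

# Maps tool names offered to the model onto local Python callables.
REGISTRY = {"count_lines": count_lines}


def run_tool_calls(message: dict) -> list:
    """Run each tool call in an assistant message; return 'tool' role replies."""
    replies = []
    for call in message.get("tool_calls") or []:
        fn = call["function"]
        handler = REGISTRY.get(fn["name"])
        if handler is None:
            continue  # skip tools that were never offered
        result = handler(**fn["arguments"])
        replies.append({"role": "tool", "content": str(result)})
    return replies
```

The replies would be appended to the message history and sent back in a follow-up chat request so the model can use the tool results in its final answer.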

Device	Speed (tokens/s)	Context	Memory (GB)	Versions
RTX 3090 24 GB	~214	8192	11	IQ4_XS, 0.15.1
RTX 3090 24 GB	~213	48k	23	IQ4_XS, 0.15.1
i5-1135G7 + 2080 Ti 11 GB	~54	8192	11 (6%/94% CPU/GPU)	IQ4_XS, 0.15.1
i7-12700H + 3070 Ti Mobile 8 GB	~25	8192	11 (35%/65% CPU/GPU)	IQ4_XS, 0.15.1
M1 Max 32 GB	~84	8192	11	IQ4_XS, 0.15.1
M1 Max 32 GB	~80	55k	25	IQ4_XS, 0.15.1
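One way to obtain comparable speed figures on other hardware is to derive tokens per second from the timing fields that Ollama returns with a completed, non-streaming response. The sketch below assumes the `eval_count` and `eval_duration` (nanoseconds) fields of the response, and the `num_ctx` request option for the context size. The helper names and the sample numbers are made up for illustration, and this is not necessarily how the figures in the table were measured.

```python
def tokens_per_second(response: dict) -> float:
    """Generation speed from eval_count and eval_duration (nanoseconds)."""
    seconds = response["eval_duration"] / 1e9
    return response["eval_count"] / seconds


def options_for_context(num_ctx: int) -> dict:
    """Request options selecting the context window size, e.g. 8192."""
    return {"num_ctx": num_ctx}


if __name__ == "__main__":
    # Made-up sample values, not real measurements.
    sample = {"eval_count": 428, "eval_duration": 2_000_000_000}
    print(f"{tokens_per_second(sample):.1f} tokens/s")
    print(options_for_context(8192))
```

The dictionary from `options_for_context` would go in the `options` field of the chat request, so the same prompt can be timed at different context sizes.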