sammcj/qwen2.5-coder-32b-128k:q6_k

648 Downloads · Updated 1 year ago

Qwen2.5 Coder 32B with the corrected 128k context

tools

ollama run sammcj/qwen2.5-coder-32b-128k:q6_k

curl http://localhost:11434/api/chat \
  -d '{
    "model": "sammcj/qwen2.5-coder-32b-128k:q6_k",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

from ollama import chat

response = chat(
    model='sammcj/qwen2.5-coder-32b-128k:q6_k',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)

import ollama from 'ollama'

const response = await ollama.chat({
  model: 'sammcj/qwen2.5-coder-32b-128k:q6_k',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)

Details

bf89d7131e9e · 27GB

model

arch qwen2

parameters 32.8B

quantization Q6_K

27GB

system

You are Qwen, created by Alibaba Cloud. You are a helpful assistant. You are an expert programmer. Y

538B

template

{{- if .Suffix }}<|fim_prefix|>{{ .Prompt }}<|fim_suffix|>{{ .Suffix }}<|fim_middle|> {{- else if .M

1.6kB
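
The visible start of the template suggests that when a request carries a suffix, the prompt is wrapped in fill-in-the-middle tokens. A minimal sketch of that branch, together with a matching `/api/generate` request body, might look like the following. The helper names `render_fim` and `build_fim_request` are hypothetical, and only the portion of the template shown above is mirrored; the truncated remainder is not guessed at.

```python
import json

MODEL = "sammcj/qwen2.5-coder-32b-128k:q6_k"


def render_fim(prompt: str, suffix: str) -> str:
    """Mirror the visible `.Suffix` branch of the template (illustrative only)."""
    return f"<|fim_prefix|>{prompt}<|fim_suffix|>{suffix}<|fim_middle|>"


def build_fim_request(prompt: str, suffix: str) -> dict:
    """Build a body for POST /api/generate; the server applies the template itself."""
    return {"model": MODEL, "prompt": prompt, "suffix": suffix, "stream": False}


if __name__ == "__main__":
    before = "def add(a, b):\n    return "
    after = "\n\nprint(add(1, 2))\n"
    # What the model would see, assuming only the visible branch applies.
    print(render_fim(before, after))
    # What a client would send; no network call is made here.
    print(json.dumps(build_fim_request(before, after), indent=2))
```

In normal use the client sends only `prompt` and `suffix`; `render_fim` is there to show what the template appears to produce, not something a caller needs to do by hand.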

params

{ "min_p": 0.9, "num_ctx": 32768, "num_keep": 256, "repeat_penalty": 1.05, "stop

130B

Readme

This model uses Unsloth’s GGUF, which corrects the context length to 128k (the official Ollama model has it wrong).

https://huggingface.co/unsloth/Qwen2.5-Coder-32B-Instruct-128K-GGUF/blob/main/Qwen2.5-Coder-32B-Instruct-Q6_K.gguf
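
Note that the params listed above set `num_ctx` to 32768, so a request that wants the longer window probably has to ask for it. The sketch below builds a `/api/chat` body that overrides `num_ctx` through the `options` field. It assumes "128k" means 131072 tokens and that the machine has enough memory for a window that large; `build_chat_request` and `send` are hypothetical helper names.

```python
import json
import urllib.request

MODEL = "sammcj/qwen2.5-coder-32b-128k:q6_k"
OLLAMA_URL = "http://localhost:11434/api/chat"  # same local endpoint as the curl example
ASSUMED_128K = 131072  # assumption: "128k" means 128 * 1024 tokens


def build_chat_request(content: str, num_ctx: int = ASSUMED_128K) -> dict:
    """Build a non-streaming chat body with a per-request context-length override."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": content}],
        "options": {"num_ctx": num_ctx},
        "stream": False,
    }


def send(body: dict) -> str:
    """POST the body to a running Ollama server and return the reply text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["message"]["content"]


if __name__ == "__main__":
    # Print the request body only; no server is contacted.
    print(json.dumps(build_chat_request("Hello!"), indent=2))
    # With a running Ollama server, uncomment the next line:
    # print(send(build_chat_request("Hello!")))
```

A larger `num_ctx` increases memory use for the KV cache, so passing a smaller value to `build_chat_request` is a reasonable fallback when the full window does not fit.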