
num_ctx is fixed to 32768, so this WizardLM-2 7B model is ready to use with the model's full 32k context window.


cf9ad469dee1 · 4.4GB

Architecture: llama
Parameters: 7.24B
Quantization: Q4_K_M
Params: { "num_ctx": 32768, "stop": [ "<|im_start|>", "<|im_end|>" ] }
Template: {{ if .System }}{{ .System }} {{ end }}{{ if .Prompt }}USER: {{ .Prompt }} {{ end }}ASSISTANT: {{ .R…
System: A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, …
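These settings can be reproduced in an Ollama Modelfile. A minimal sketch, assuming a local GGUF file as the base (the filename is hypothetical, and the truncated template/system strings above are left out rather than guessed):

```
# Hypothetical base file; substitute your actual GGUF path or model tag
FROM ./wizardlm-2-7b.Q4_K_M.gguf

# Activate the full 32k context window (Ollama's default is 2048)
PARAMETER num_ctx 32768

# Stop tokens matching the params JSON above
PARAMETER stop <|im_start|>
PARAMETER stop <|im_end|>
```

Build it with `ollama create <name> -f Modelfile`.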

Readme

WizardLM-2-7B

| Model quant | Size | Bits | Perplexity |
|---|---|---|---|
| wizardlm2-7b:Q4_0 | 4.1GB | 4 | +0.2166 ppl |
| wizardlm2-7b:Q4_K_M | 4.4GB | 4 | +0.0532 ppl |
| wizardlm2-7b:Q5_K_M | 5.1GB | 5 | +0.0122 ppl |
| wizardlm2-7b:Q6_K | 5.9GB | 6 | +0.0008 ppl |
Config

max_position_embeddings: 32768
rope_theta: 500000.0
vocab_size: 32000

Remarks
  • the ‘latest’ tag points to Q4_0
  • the modelfile sets num_ctx to 32768 (the Ollama default is only 2048)
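Because num_ctx is already baked into the modelfile, no extra options are needed at run time, but it can also be pinned per request through Ollama's `/api/generate` endpoint. A minimal sketch, assuming the Q4_K_M tag from the table above and a local server on the default port:

```shell
# Build the request body for Ollama's /api/generate endpoint;
# "options.num_ctx" pins the 32k context window per request.
BODY='{"model":"wizardlm2-7b:Q4_K_M","prompt":"Say hi","options":{"num_ctx":32768}}'
echo "$BODY"
# To send it against a running Ollama server:
#   curl http://localhost:11434/api/generate -d "$BODY"
```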