ExpedientFalcon/Qwen3-4B-UD-Q5_K

ExpedientFalcon/Qwen3-4B-UD-Q5_K_XL:latest

447.1K Downloads · Updated 1 year ago

Qwen3-4B in Unsloth's UD 2.0 adaptively quantized Q5_K_XL format. Much better for coding than vanilla Q4_K_M quants, without taking up the VRAM of an 8-bit Q8_0 model. Source: https://huggingface.co/unsloth/Qwen3-4B-GGUF/tree/main

tools

ollama run ExpedientFalcon/Qwen3-4B-UD-Q5_K_XL

curl http://localhost:11434/api/chat \
  -d '{
    "model": "ExpedientFalcon/Qwen3-4B-UD-Q5_K_XL",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

from ollama import chat

response = chat(
    model='ExpedientFalcon/Qwen3-4B-UD-Q5_K_XL',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)

import ollama from 'ollama'

const response = await ollama.chat({
  model: 'ExpedientFalcon/Qwen3-4B-UD-Q5_K_XL',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)
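
The `tools` tag on this page indicates that the model's template accepts tool definitions. A minimal sketch of a tool-calling request body for the same `/api/chat` endpoint, using only the Python standard library, might look like the following. The `get_weather` tool, its schema, and the helper names `build_tool_request` and `send` are hypothetical and exist only for illustration; `stream` is set to false so the reply would arrive as a single JSON object.

```python
import json
import urllib.request

MODEL = "ExpedientFalcon/Qwen3-4B-UD-Q5_K_XL"
CHAT_URL = "http://localhost:11434/api/chat"  # same local address as the curl example


def build_tool_request(prompt: str) -> dict:
    """Build a /api/chat body that offers one (hypothetical) tool to the model."""
    weather_tool = {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool, not shipped with the model
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "tools": [weather_tool],
        "stream": False,
    }


def send(body: dict) -> dict:
    """POST the body to a running Ollama server and return the parsed reply."""
    req = urllib.request.Request(
        CHAT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Print the request body only; nothing is sent until send() is called.
print(json.dumps(build_tool_request("What is the weather in Paris?"), indent=2))
```

Calling `send(build_tool_request(...))` against a running server should return a reply whose `message` may carry a `tool_calls` list when the model chooses to use the tool, and plain `content` otherwise.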

Details

f5bc81c7b7e3 · 2.9GB

model

arch: qwen3

parameters: 4.02B

quantization: Q5_K_M

2.9GB
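
The size and parameter count above give a rough check on the description's claim about sitting below a Q8_0 model in memory. The sketch below is back-of-the-envelope arithmetic only: it assumes "GB" means 10^9 bytes, ignores file metadata and the fact that different tensors use different quantization types, and does not account for the KV cache. The helper names are illustrative.

```python
# Figures taken from the Details panel on this page.
FILE_SIZE_GB = 2.9       # assumed to be decimal gigabytes (1e9 bytes)
PARAMS_BILLIONS = 4.02

# A Q8_0 block stores 32 int8 weights plus one 16-bit scale: 34 bytes per 32 weights.
Q8_0_BITS_PER_WEIGHT = 34 * 8 / 32


def bits_per_weight(size_gb: float, params_b: float) -> float:
    """Average bits stored per parameter for a file of the given size."""
    return size_gb * 8 / params_b


def size_gb_at(bits: float, params_b: float) -> float:
    """Approximate file size in GB if every parameter used `bits` bits."""
    return params_b * bits / 8


bpw = bits_per_weight(FILE_SIZE_GB, PARAMS_BILLIONS)
q8_size = size_gb_at(Q8_0_BITS_PER_WEIGHT, PARAMS_BILLIONS)

print(f"effective bits per weight: {bpw:.2f}")   # 5.77
print(f"rough Q8_0 size: {q8_size:.2f} GB")      # 4.27
```

Under these assumptions the file averages a little under 6 bits per weight, and a uniform Q8_0 build of the same model would be roughly 1.4 GB larger.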

params

{ "min_p": 0, "num_ctx": 40960, "num_predict": 16384, "repeat_penalty": 1, "stop

166B
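
The params preview above is cut off, but the visible keys are standard Ollama runtime options, and they can be overridden per request through the `options` field of `/api/chat`. A minimal sketch, assuming you want a smaller context window and shorter replies than the page defaults; the helper names and the override values are illustrative, and the truncated `stop` entry is deliberately left out rather than guessed.

```python
import json

MODEL = "ExpedientFalcon/Qwen3-4B-UD-Q5_K_XL"

# Values visible in the truncated params preview on this page.
# The "stop" entry is cut off in the preview, so it is not reproduced here.
PAGE_DEFAULTS = {
    "min_p": 0,
    "num_ctx": 40960,
    "num_predict": 16384,
    "repeat_penalty": 1,
}


def build_chat_request(prompt: str, **overrides) -> dict:
    """Build a /api/chat body that sends only the options being overridden."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "options": dict(overrides),
        "stream": False,
    }


def effective_options(overrides: dict) -> dict:
    """Merge overrides onto the visible page defaults, for inspection only."""
    return {**PAGE_DEFAULTS, **overrides}


body = build_chat_request("Hello!", num_ctx=8192, num_predict=512)
print(json.dumps(body, indent=2))
print(effective_options(body["options"]))
```

Lowering `num_ctx` generally reduces the memory reserved for the context, which can matter more than the weight file size on small GPUs. The same `options` dictionary can also be passed to `chat(...)` in the `ollama` Python package shown earlier.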

template

{{- if .Messages }} {{- if or .System .Tools }}<|im_start|>system {{- if .System }} {{ .System }} {{

1.5kB

Readme

No readme