Yi model fine-tuned for RAG by llmware
161 Pulls · Updated 13 months ago
90c7536b2c89 · 6.4GB
dragon-yi-6b-v0 is part of the dRAGon (“Delivering RAG On …”) model series, RAG-instruct trained on top of a Yi-6B base model.
DRAGON models are fine-tuned specifically for fact-based question answering over complex business and legal documents, with an emphasis on reducing hallucinations and returning short, clear answers suitable for workflow automation.
Note: These models are tuned for RAG, not free-form chat. For an example of the kinds of questions you can ask, see this benchmark.
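Below is a minimal sketch of that usage pattern, assuming Ollama is running locally on its default port (11434) and that you have already pulled the model; the tag name, helper function, and sample passage are placeholders, not part of this model card.

```python
# Minimal sketch: fact-based Q&A over a supplied passage via the local Ollama REST API.
# Assumes Ollama is serving on its default port; the model tag below is a placeholder
# for whatever tag you actually pulled.
import requests

MODEL_TAG = "dragon-yi-6b-v0"  # placeholder; substitute the tag you pulled


def ask(context: str, question: str, model: str = MODEL_TAG) -> str:
    """Send a context passage followed by a question and return the model's short answer."""
    prompt = f"{context}\n{question}"
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"].strip()


if __name__ == "__main__":
    passage = (
        "The lease term begins on January 1, 2024 and runs for 36 months. "
        "The monthly base rent is $4,500, due on the first business day of each month."
    )
    print(ask(passage, "What is the monthly base rent?"))
```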
I’ve done minimal work on the Modelfile beyond confirming that it runs. In my limited testing, the default quantization (q4_K_M) seems noticeably weaker at this task than the q6_K and q8_0 variants.
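To try one of the heavier quantizations, a sketch like the one above only needs to point at a different tag. The `:q6_K` and `:q8_0` suffixes below are assumptions about how the variant tags are published, so check this model's tags list for the names that actually exist.

```python
# Compare answers across quantization variants by swapping the model tag.
# Reuses the ask() helper and passage from the sketch above; the tag names
# here are assumptions, not confirmed published tags.
for tag in ("dragon-yi-6b-v0:q6_K", "dragon-yi-6b-v0:q8_0"):
    print(tag, "->", ask(passage, "What is the monthly base rent?", model=tag))
```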
For more information, see /llmware/dragon-yi-6b-v0 on Hugging Face.