vicuna:7b-v1.5-16k-q3_K_L

1.2M Downloads Updated 2 years ago

General use chat model based on Llama and Llama 2 with 2K to 16K context sizes.

7b 13b 33b

ollama run vicuna:7b-v1.5-16k-q3_K_L

curl http://localhost:11434/api/chat \
  -d '{
    "model": "vicuna:7b-v1.5-16k-q3_K_L",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

from ollama import chat

response = chat(
    model='vicuna:7b-v1.5-16k-q3_K_L',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)

import ollama from 'ollama'

const response = await ollama.chat({
  model: 'vicuna:7b-v1.5-16k-q3_K_L',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)

Details

6e9eee2fdf07 · 3.6GB

model

arch: llama

parameters: 6.74B

quantization: Q3_K_L

3.6GB

system

A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful,

155B

params

{ "num_ctx": 16384, "rope_frequency_scale": 0.125, "stop": [ "USER:", "A

76B
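
As a quick sanity check on the (truncated) params above, the two visible values can be multiplied together. This is only arithmetic on numbers shown on this page. Reading the result as a base window that is stretched to 16K by position scaling is an assumption, not something the page states.

```python
# Values copied from the params block above.
num_ctx = 16384
rope_frequency_scale = 0.125

# Assumption: the scale is the ratio of an original window to num_ctx.
implied_base_window = int(num_ctx * rope_frequency_scale)
print(implied_base_window)  # 2048
```

The result happens to match the 2048-token context size the Readme lists for v1.5.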

template

{{ .System }} USER: {{ .Prompt }} ASSISTANT:

45B
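
To make the template concrete, here is a minimal Python sketch of the string it would produce for one turn. It is plain string formatting, not Ollama's actual template engine. `render_prompt` is a hypothetical helper, and the system text is a placeholder because the system message shown above is truncated.

```python
def render_prompt(system: str, prompt: str) -> str:
    """Mimic the template: {{ .System }} USER: {{ .Prompt }} ASSISTANT:"""
    return f"{system} USER: {prompt} ASSISTANT:"


# Placeholder system text (hypothetical); the real message is cut off on this page.
rendered = render_prompt("<system message>", "Hello!")
print(rendered)  # <system message> USER: Hello! ASSISTANT:
```

The model's reply is generated after the trailing `ASSISTANT:`, and the `USER:` stop sequence visible in the params keeps it from writing the next user turn itself.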

Readme

Vicuna is a chat assistant model. It is available in three variants and three sizes. v1.3 is fine-tuned from Llama and has a context size of 2048 tokens. v1.5 is fine-tuned from Llama 2 and has a context size of 2048 tokens. v1.5-16k is fine-tuned from Llama 2 and has a context size of 16k tokens. All three variants are trained on conversations collected from ShareGPT.
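
The variant list above can be written as a small lookup table, which may help when deciding which variant can hold a prompt of a given length. This is a sketch only: the figures are the ones stated in this Readme, "16k" is taken as 16384 to match `num_ctx` in the params, and `variants_fitting` is a hypothetical helper rather than part of any library.

```python
# Context sizes as stated in the Readme; 16k assumed to mean 16384 tokens.
VARIANTS = {
    "v1.3": {"base": "Llama", "context_tokens": 2048},
    "v1.5": {"base": "Llama 2", "context_tokens": 2048},
    "v1.5-16k": {"base": "Llama 2", "context_tokens": 16384},
}


def variants_fitting(required_tokens: int) -> list[str]:
    """Return the variants whose listed context size covers required_tokens."""
    return [
        name
        for name, info in VARIANTS.items()
        if info["context_tokens"] >= required_tokens
    ]


print(variants_fitting(3000))  # ['v1.5-16k']
```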

Example prompts

What is the meaning of life? Explain it in 5 paragraphs.
