An extension of Mistral to support context windows of 64K or 128K.
7b
43.7K Pulls Updated 14 months ago
9f1a8b3eec09 · 3.2GB
model
arch llama · parameters 7.24B · quantization Q3_K_S · 3.2GB
params
{
  "num_ctx": 65536
}
Readme
Yarn Mistral is a model based on Mistral that extends its context window up to 128K tokens. It was developed by Nous Research, which applied the YaRN method to further train the model to support larger context windows.
CLI
64k context size:
ollama run yarn-mistral
128k context size:
ollama run yarn-mistral:7b-128k
API
Example:
curl -X POST http://localhost:11434/api/generate -d '{
  "model": "yarn-mistral:7b-128k",
  "prompt": "Here is a story about llamas eating grass"
}'
References
YaRN: Efficient Context Window Extension of Large Language Models