An extension of Mistral to support context windows of 64K or 128K.
7b
43.7K Pulls Updated 14 months ago
9f1a8b3eec09 · 3.2GB
model
arch llama · parameters 7.24B · quantization Q3_K_S · 3.2GB
params
{
  "num_ctx": 65536
}
Readme
Yarn Mistral is a model based on Mistral that extends its context window up to 128K tokens. It was developed by Nous Research, which applied the YaRN method to further train the model to support larger context windows.
CLI
64k context size:
ollama run yarn-mistral
128k context size:
ollama run yarn-mistral:7b-128k
API
Example:
curl -X POST http://localhost:11434/api/generate -d '{
  "model": "yarn-mistral:7b-128k",
  "prompt": "Here is a story about llamas eating grass"
}'
References
YaRN: Efficient Context Window Extension of Large Language Models