num_ctx fixed to 8192 and the EOS token corrected. This Llama 3 8B Instruct model is ready to use with the model's full 8K context window.
397 Pulls · Updated 6 months ago
aae5b523ef30 · 4.9GB

model (4.9GB): arch llama · parameters 8.03B · quantization Q4_K_M
system (129B):
You are a helpful, smart, kind, and efficient AI assistant. You always fulfill the user's requests
params (129B):
{"num_ctx":8192,"num_keep":24,"stop":["\u003c|start_header_id|\u003e","\u003c|end_header_id|\u003e", …
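The stop tokens in the params blob are stored with JSON unicode escapes (`\u003c` is `<`, `\u003e` is `>`). Decoding them shows the Llama 3 special header tokens; only the fields visible in the truncated preview are re-typed in this sketch:

```python
import json

# Re-typed from the truncated params preview above; the full blob
# contains more entries than the two stop tokens shown here.
raw = '{"num_ctx": 8192, "num_keep": 24, "stop": ["\\u003c|start_header_id|\\u003e", "\\u003c|end_header_id|\\u003e"]}'
params = json.loads(raw)

print(params["num_ctx"])  # 8192
print(params["stop"])     # ['<|start_header_id|>', '<|end_header_id|>']
```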
template (257B):
{{ if .System }}<|start_header_id|>system<|end_header_id|>
{{ .System }}<|eot_id|>{{ end }}{{ if . …
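The template preview above is cut off. For reference, templates for Llama 3 Instruct in Ollama typically follow the public Llama 3 chat format, roughly as sketched below (this is an illustration of that format, not necessarily this model's exact template):

```
{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ .Response }}
```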
Readme
Meta-Llama-3-8B-Instruct
| Model Quants | Size | Bit | Perplexity |
|---|---|---|---|
| llama3-8b-instruct:Q4_0 | 4.7GB | 4 | +0.2166 ppl |
| llama3-8b-instruct:Q4_K_M | 4.9GB | 4 | +0.0532 ppl |
| llama3-8b-instruct:Q5_K_M | 5.7GB | 5 | +0.0122 ppl |
| llama3-8b-instruct:Q6_K | 6.6GB | 6 | +0.0008 ppl |
Config
"max_position_embeddings": 8192
"rope_theta": 500000.0
"vocab_size": 128256
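Since max_position_embeddings is 8192, anything beyond the window gets truncated by the runtime. A rough pre-check can be sketched as follows (the ~4 characters-per-token figure is a common heuristic, not this model's actual tokenizer ratio):

```python
NUM_CTX = 8192        # context window set in the Modelfile
CHARS_PER_TOKEN = 4   # rough heuristic; real tokenizers vary

def fits_in_context(prompt: str, reserve_for_output: int = 512) -> bool:
    """Rough check that a prompt leaves room for the model's reply."""
    est_tokens = len(prompt) / CHARS_PER_TOKEN
    return est_tokens <= NUM_CTX - reserve_for_output

print(fits_in_context("hello"))       # True
print(fits_in_context("x" * 40_000))  # False (~10k estimated tokens)
```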
Remarks
- the 'latest' tag points to Q4_0
- the Modelfile sets num_ctx to 8192 (Ollama's default is only 2048)
- the EOS token is fixed, so responses no longer repeat endlessly
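The fixes listed above can be reproduced in a custom Modelfile. A sketch (the base tag and the `<|eot_id|>` stop are assumptions inferred from the remarks; adjust to the actual source weights):

```
# Sketch only — the FROM tag and <|eot_id|> stop are assumptions
FROM llama3:8b-instruct-q4_K_M
PARAMETER num_ctx 8192
PARAMETER num_keep 24
PARAMETER stop <|start_header_id|>
PARAMETER stop <|end_header_id|>
PARAMETER stop <|eot_id|>
SYSTEM You are a helpful, smart, kind, and efficient AI assistant. You always fulfill the user's requests
```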