757 pulls · 10 months ago

Long context: a Llama 4 Scout model with a 10-million-token context window.

Tags: vision · tools
ollama run tukia/llama-4-Scout-17b-16e-Instruct-q4_K_M
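The model ships with a 4096-token default context. One way to run it with a larger window is a custom Modelfile; this is a sketch, assuming the tag above has already been pulled locally, and `num_ctx 32768` is an illustrative value (memory use grows with context size):

```
# Modelfile: derive a larger-context variant of this model
FROM tukia/llama-4-Scout-17b-16e-Instruct-q4_K_M

# Raise the context window from the default 4096 (illustrative value)
PARAMETER num_ctx 32768
```

Build and run it with `ollama create scout-32k -f Modelfile` followed by `ollama run scout-32k` (`scout-32k` is a hypothetical name).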

Details


f5e81ab317cf · 67GB

Architecture: llama4 · Parameters: 109B · Quantization: Q4_K_M

Default parameters: { "num_ctx": 4096, "temperature": 0.1, "top_p": 0.9 }
Template (truncated):
{{- if .System }}<|header_start|>system<|header_end|> {{- with .Tools }}Environment: ipython You hav…

System prompt (truncated):
You are a helpful and intelligent AI assistant. For every problem you solve, always explain your rea…
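The defaults above can also be overridden per request through the Ollama REST API (`POST /api/generate` with an `options` object). A minimal sketch, assuming a local Ollama server on the default port; `num_ctx` 32768 and the prompt are illustrative values:

```python
import json

# Build a request that overrides the model's default generation parameters.
# The "options" keys mirror the defaults listed above; num_ctx 32768 is an
# illustrative assumption -- memory use grows with context size.
payload = {
    "model": "tukia/llama-4-Scout-17b-16e-Instruct-q4_K_M",
    "prompt": "Summarize the following document: ...",
    "stream": False,
    "options": {
        "num_ctx": 32768,       # raise from the default 4096
        "temperature": 0.1,
        "top_p": 0.9,
    },
}

# Send with any HTTP client, e.g.:
#   requests.post("http://localhost:11434/api/generate", json=payload)
body = json.dumps(payload)
print(body[:50])
```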

Readme

Downloaded from https://huggingface.co/meta-llama/Llama-4-Scout-17B-16E-Instruct, then converted and quantized using v0.6.7-rc1.

  • Updated the system prompt to use chain-of-thought reasoning.
  • Roughly 1.8TB of memory is required for the full 10M-token context window. You could swap to disk …
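The 1.8TB figure is consistent with a back-of-envelope KV-cache calculation. The hyperparameters below are assumptions about Llama 4 Scout's attention configuration (48 layers, 8 KV heads, head dimension 128, fp16 cache); adjust them if the published config differs:

```python
# Back-of-envelope KV-cache sizing for a 10M-token context.
# All hyperparameters are assumed values, not taken from this model card.
n_layers = 48
n_kv_heads = 8
head_dim = 128
bytes_per_elem = 2                      # fp16 cache
kv_factor = 2                           # one K and one V tensor per layer

bytes_per_token = n_layers * n_kv_heads * head_dim * bytes_per_elem * kv_factor
context_tokens = 10_000_000
total_tib = bytes_per_token * context_tokens / 2**40

print(f"{bytes_per_token // 1024} KiB per token")   # 192 KiB
print(f"{total_tib:.2f} TiB for 10M tokens")        # ~1.79 TiB, matching the quoted ~1.8TB
```

At roughly 192 KiB of cache per token, context length, not model weights (67GB here), dominates memory at the 10M-token extreme.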