
A 4-million-token context window model built on Llama-3.1.

ollama run tukia/nvidia-ultralong-4M

Details

11 months ago

dc3c87b57e07 · 16GB · llama · 8.04B · F16
params: { "stop": [ "<|start_header_id|>", "<|end_header_id|>", "<|eot_id|>" ] }

template: {{- range .Messages }}<|start_header_id|>{{ .Role }}<|end_header_id|> {{ .Content }}<|eot_id|> {{- end }}
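The template above iterates over the chat messages and wraps each one in Llama-3.1's header and end-of-turn tokens. A rough Python sketch of what that rendering produces (the trailing assistant header is an assumption based on the standard Llama-3.1 chat format, since the template preview is truncated):

```python
def render_prompt(messages):
    """Render chat messages roughly the way the Go template above does.

    `messages` is a hypothetical list of {"role", "content"} dicts,
    mirroring Ollama's .Messages.
    """
    parts = []
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|> {m['content']}<|eot_id|>"
        )
    # Assumed: cue the model to reply as the assistant (standard Llama-3.1
    # convention; not visible in the truncated template).
    parts.append("<|start_header_id|>assistant<|end_header_id|>")
    return " ".join(parts)

prompt = render_prompt([{"role": "user", "content": "Hello"}])
print(prompt)
```

Note how the tokens listed under `stop` are exactly the delimiters this rendering emits, which is why generation halts cleanly at the end of a turn.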

Readme

Nemotron-UltraLong-8B, sourced from https://huggingface.co/nvidia/Llama-3.1-Nemotron-8B-UltraLong-4M-Instruct.

Context window size of 4 million tokens.
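Ollama does not allocate the full context window by default; you opt in by setting `num_ctx` in the request options. A minimal sketch of a request payload against Ollama's local HTTP API (the endpoint and option names are standard Ollama; the `num_ctx` value shown is deliberately far below 4M, since a 4-million-token context requires enormous amounts of memory):

```python
import json
# import urllib.request  # uncomment to actually send the request

payload = {
    "model": "tukia/nvidia-ultralong-4M",
    "prompt": "Summarize the following document: ...",
    "stream": False,
    # Raise num_ctx to use the long context window. 131072 is an
    # illustrative value; the model supports up to 4M tokens, but
    # memory use grows with the window size.
    "options": {"num_ctx": 131072},
}
body = json.dumps(payload).encode("utf-8")

# Sending the request (requires a running Ollama server on port 11434):
# req = urllib.request.Request(
#     "http://localhost:11434/api/generate",
#     data=body,
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["response"])
```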