
A long-context model with a 4-million-token context window, built on Llama-3.1.

ollama run tukia/nvidia-ultralong-4M


Readme

Nemotron-UltraLong-8B, sourced from https://huggingface.co/nvidia/Llama-3.1-Nemotron-8B-UltraLong-4M-Instruct.

The context window is 4 million tokens.
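
Ollama typically loads a model with a default context length far smaller than the model's maximum unless `num_ctx` is set, so using the long window likely requires requesting it explicitly. The following is a minimal sketch, not an official recipe: it assumes a local Ollama server on the default port 11434, the helper names `build_request` and `generate` are hypothetical, and the value 131072 is only an illustrative choice (larger values need more memory).

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "tukia/nvidia-ultralong-4M"


def build_request(prompt: str, num_ctx: int = 131072) -> dict:
    """Build a non-streaming /api/generate payload with an explicit context size.

    num_ctx=131072 is an illustrative default, not a recommended setting.
    """
    if num_ctx <= 0:
        raise ValueError("num_ctx must be positive")
    return {
        "model": MODEL,
        "prompt": prompt,
        "stream": False,  # return one JSON object instead of a stream
        "options": {"num_ctx": num_ctx},  # requested context length in tokens
    }


def generate(prompt: str, num_ctx: int = 131072) -> str:
    """Send the prompt to the local Ollama server and return the generated text."""
    payload = json.dumps(build_request(prompt, num_ctx)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

With the server running and the model pulled, calling `generate("Hello", num_ctx=32768)` would return the model's reply as a string. Starting with a modest `num_ctx` and raising it as memory allows is probably the safer approach.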