
ollama run nuroai/avalon-2b


AVALON-2B: The First Sub-3B Self-Reflective Language Model

The first sub-3B-parameter model with Self-RAG (Self-Reflective Retrieval-Augmented Generation). It generates
reflection tokens inline with its output: [Retrieval] and [No Retrieval] to signal whether external knowledge is
needed for the current query, and [Utility:X] to score the usefulness of its own response.
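A minimal sketch of how a client might consume those reflection tokens. Note that `parse_reflection_tokens` and the regex are illustrative helpers written for this README, not part of the AVALON release; they assume the tokens appear verbatim in the decoded text, as described above.

```python
import re

# Matches the Self-RAG reflection tokens described above (assumed to
# appear literally in the decoded output; utility scores assumed 1-5).
REFLECTION_RE = re.compile(r"\[(Retrieval|No Retrieval|Utility:[1-5])\]")

def parse_reflection_tokens(text: str) -> dict:
    """Extract the retrieval decision and utility score from raw model output."""
    tokens = REFLECTION_RE.findall(text)
    utility = None
    for tok in tokens:
        if tok.startswith("Utility:"):
            utility = int(tok.split(":")[1])
    return {
        "needs_retrieval": "Retrieval" in tokens,
        "utility": utility,
    }

# Example on a mock completion:
result = parse_reflection_tokens(
    "[Retrieval] The Eiffel Tower is 330 m tall. [Utility:5]"
)
```

A RAG pipeline would branch on `needs_retrieval`: fetch documents and re-prompt when it is true, or return the answer directly when the model emits [No Retrieval].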

Key Features:
- 1.88B parameters (1.5 GB at Q4_K_M quantization)
- 82.5% Self-RAG token accuracy
- 62.04% MMLU (beats Gemma 4 E2B)
- 40+ tok/s on Apple M3
- Runs on iPhone (12 tok/s on A17 Pro)

Links:
- HuggingFace: https://huggingface.co/nuroai/Avalon-2B
- GitHub: https://github.com/nuro-labs/avalon-2b
- Paper: https://github.com/nuro-labs/avalon-2b/blob/main/AVALON_2B_Paper.pdf

Built by Nuro AI Labs (UK) - https://nuro.one