9 Downloads Updated 4 weeks ago
ollama run nuroai/avalon-2b
AVALON-2B: The First Sub-3B Self-Reflective Language Model
The first sub-3B parameter model with Self-RAG (Self-Reflective Retrieval-Augmented Generation). It emits [Retrieval] and [No Retrieval] tokens to signal when external knowledge is needed, and [Utility:X] tokens to rate the usefulness of its own answers.
Key Features:
- 1.88B parameters (Q4_K_M quantized to 1.5GB)
- 82.5% Self-RAG token accuracy
- 62.04% MMLU (beats Gemma 4 E2B)
- 40+ tok/s on Apple M3
- Runs on iPhone (12 tok/s on A17 Pro)
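Because the reflection tokens appear inline in the model's output, a client needs to strip them out and act on them. The sketch below shows one way to do that in Python; the token formats ([Retrieval], [No Retrieval], [Utility:X]) come from this card, but the parsing function itself is an illustrative assumption, not part of the released tooling.

```python
import re

# Reflection tokens as described on the model card. The single-digit
# utility score is an assumption about the "X" placeholder.
_TOKEN_RE = re.compile(r"\[(Retrieval|No Retrieval|Utility:\d)\]")

def parse_self_rag(text: str) -> dict:
    """Split a Self-RAG style response into the answer text and its
    reflection tokens, so a caller can decide whether to run retrieval."""
    tokens = _TOKEN_RE.findall(text)
    answer = _TOKEN_RE.sub("", text).strip()
    utility = next(
        (int(t.split(":")[1]) for t in tokens if t.startswith("Utility:")),
        None,
    )
    return {
        "answer": answer,
        "needs_retrieval": "Retrieval" in tokens,
        "utility": utility,
    }
```

A caller would feed the raw completion from `ollama run nuroai/avalon-2b` (or the Ollama API) into `parse_self_rag`, fetch documents and re-prompt when `needs_retrieval` is true, and otherwise return the cleaned answer directly.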
Links:
- HuggingFace: https://huggingface.co/nuroai/Avalon-2B
- GitHub: https://github.com/nuro-labs/avalon-2b
- Paper: https://github.com/nuro-labs/avalon-2b/blob/main/AVALON_2B_Paper.pdf
Built by Nuro AI Labs (UK) - https://nuro.one