
ollama run nuroai/avalon-2b


AVALON-2B: The First Sub-3B Self-Reflective Language Model

The first sub-3B-parameter model with Self-RAG (Self-Reflective Retrieval-Augmented Generation). It generates
reflection tokens inline with its output: [Retrieval] and [No Retrieval] to signal whether external knowledge is
needed for the current query, and [Utility:X] to score the usefulness of its own response.
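A minimal sketch of how a client might consume those reflection tokens. Note that `parse_reflection_tokens` and the regex are illustrative helpers written for this README, not part of the AVALON release; they assume the tokens appear verbatim in the decoded text, as described above.

```python
import re

# Matches the Self-RAG reflection tokens described above (assumed to
# appear literally in the decoded output; utility scores assumed 1-5).
REFLECTION_RE = re.compile(r"\[(Retrieval|No Retrieval|Utility:[1-5])\]")

def parse_reflection_tokens(text: str) -> dict:
    """Extract the retrieval decision and utility score from raw model output."""
    tokens = REFLECTION_RE.findall(text)
    utility = None
    for tok in tokens:
        if tok.startswith("Utility:"):
            utility = int(tok.split(":")[1])
    return {
        "needs_retrieval": "Retrieval" in tokens,
        "utility": utility,
    }

# Example on a mock completion:
result = parse_reflection_tokens(
    "[Retrieval] The Eiffel Tower is 330 m tall. [Utility:5]"
)
```

A RAG pipeline would branch on `needs_retrieval`: fetch documents and re-prompt when it is true, or return the answer directly when the model emits [No Retrieval].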

Key Features:
- 1.88B parameters (1.5 GB at Q4_K_M quantization)
- 82.5% Self-RAG token accuracy
- 62.04% MMLU (beats Gemma 4 E2B)
- 40+ tok/s on Apple M3
- Runs on iPhone (12 tok/s on A17 Pro)

Links:
- HuggingFace: https://huggingface.co/nuroai/Avalon-2B
- GitHub: https://github.com/nuro-labs/avalon-2b
- Paper: https://github.com/nuro-labs/avalon-2b/blob/main/AVALON_2B_Paper.pdf

Built by Nuro AI Labs (UK) - https://nuro.one