batiai/nemotron3-nano-omni-text:iq4

59 Downloads · Updated 2 months ago

ollama run batiai/nemotron3-nano-omni-text:iq4

curl http://localhost:11434/api/chat \
  -d '{
    "model": "batiai/nemotron3-nano-omni-text:iq4",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

from ollama import chat

response = chat(
    model='batiai/nemotron3-nano-omni-text:iq4',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)

import ollama from 'ollama'

const response = await ollama.chat({
  model: 'batiai/nemotron3-nano-omni-text:iq4',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)

Details

4bcf36cc7080 · 18GB

- model (18GB): arch nemotron_h_moe · parameters 31.6B · quantization IQ4_XS
- system (31B): You are a helpful AI assistant.
- params (97B): { "num_ctx": 131072, "stop": [ "<extra_id_1>", "<|endoftext|>" ], "t
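The params layer sets `num_ctx` to 131072 by default. On a machine with limited RAM it may help to request a smaller context window per call. The following is a minimal sketch of how such a request could be built against the same `/api/chat` endpoint used in the curl example; the value 8192 and the helper names `build_chat_request` and `send_chat` are illustrative assumptions, not part of the model card.

```python
import json
import urllib.request

# Same local endpoint and model tag as the curl example on this page.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"
MODEL = "batiai/nemotron3-nano-omni-text:iq4"


def build_chat_request(prompt: str, num_ctx: int = 8192) -> dict:
    """Build a chat payload that overrides the model's default num_ctx.

    8192 is an illustrative value (assumption), not a recommendation
    from the model card.
    """
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for a single JSON response instead of a stream
        "options": {"num_ctx": num_ctx},
    }


def send_chat(payload: dict) -> str:
    """POST the payload to a locally running Ollama server and return the reply text."""
    request = urllib.request.Request(
        OLLAMA_CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["message"]["content"]


# Usage (needs a running Ollama server with the model pulled):
# print(send_chat(build_chat_request("Hello!")))
```

Keeping the payload builder separate from the network call makes it easy to inspect or log the exact options being sent before any request is made.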

Readme

# Nemotron 3 Nano Omni — Text-Only — Quantized by BatiAI

**Reasoning-tuned text backbone** extracted from NVIDIA's Nemotron 3 Nano Omni multimodal model. NemotronH MoE 30B-A3B (Mamba+Attention hybrid).

> Vision and audio encoders are stripped from this GGUF because llama.cpp doesn't yet support the Omni multimodal architecture. A multimodal release is planned once upstream support lands.

## Models

| Tag | Size | RAM target | Use Case |
|-----|------|------------|----------|
| **iq3** | **17GB** | 24GB Mac | Compact reasoning |
| **iq4** | **17GB** | 32GB Mac | **Recommended** |
| **q5** | **25GB** | 36GB+ Mac | Highest quality |

## Quick Start

```
ollama run batiai/nemotron3-nano-omni-text:iq4
```

## Why This Model?

- **Reasoning-tuned**: the Omni variant is trained for step-by-step reasoning
- **30B-A3B MoE**: only about 3B parameters are active per token; fits Macs with 24GB of RAM or more
- **Hybrid Mamba+Attention**: efficient long-context inference
- **NVIDIA Open Model License**: commercial-friendly
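As a rough sanity check on the sizes involved, the average storage cost per weight can be estimated from the figures shown in the page details (18GB file, 31.6B parameters). This is only a back-of-the-envelope sketch: it assumes decimal gigabytes and ignores metadata and any tensors kept at higher precision, and `bits_per_weight` is a hypothetical helper name.

```python
def bits_per_weight(file_size_gb: float, params_billions: float) -> float:
    """Average bits stored per parameter.

    Assumes decimal units (1 GB = 1e9 bytes, 1B = 1e9 parameters),
    so the 1e9 factors cancel.
    """
    return file_size_gb * 8 / params_billions


# Figures from the page details for the iq4 tag: 18GB, 31.6B parameters.
print(f"{bits_per_weight(18, 31.6):.2f} bits per weight")  # 4.56 bits per weight
```

Note that the mixture-of-experts design reduces the compute per token (about 3B active parameters), while all 31.6B parameters still have to be stored, which is why the file size tracks the total parameter count.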

## Nano vs Nano Omni (text)

Same NemotronH MoE 30B-A3B backbone, but Omni is reasoning-focused:

| | nemotron3-nano | nemotron3-nano-omni-text |
|---|---|---|
| Tuning | General agentic | **Reasoning-focused** |
| Step-by-step | Standard | **Stronger** |
| Tool calling | ✅ | ✅ |
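Since both variants list tool calling, here is a minimal sketch of what a tool-enabled request to Ollama's `/api/chat` endpoint and a local dispatcher could look like. The `get_weather` tool, the registry, and the helper names are hypothetical and exist only for illustration; the exact fields a tool-result message should carry are an assumption, and how reliably this model emits tool calls is not something the sketch verifies.

```python
from typing import Callable

MODEL = "batiai/nemotron3-nano-omni-text:iq4"


def get_weather(city: str) -> str:
    """Hypothetical tool: returns a canned string instead of real weather data."""
    return f"No live data available for {city} in this sketch."


# JSON-schema style tool description passed in the "tools" field of the request.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string", "description": "City name"}},
            "required": ["city"],
        },
    },
}]

# Maps tool names the model may request to local Python callables.
REGISTRY: dict[str, Callable[..., str]] = {"get_weather": get_weather}


def build_tool_request(prompt: str) -> dict:
    """Build a chat payload that advertises the tools above to the model."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "tools": TOOLS,
        "stream": False,
    }


def run_tool_calls(message: dict) -> list[dict]:
    """Execute each tool call found in an assistant message.

    Returns one {"role": "tool", ...} message per executed call, ready to be
    appended to the conversation. Unknown tool names are skipped.
    """
    results = []
    for call in message.get("tool_calls", []):
        function = call["function"]
        handler = REGISTRY.get(function["name"])
        if handler is None:
            continue
        results.append({"role": "tool", "content": handler(**function["arguments"])})
    return results
```

A follow-up request would append the assistant message and the returned tool messages to `messages` so the model can produce its final answer.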

## RAM Requirements

| Your Mac RAM | iq3 (17GB) | iq4 (17GB) | q5 (25GB) |
|-------------|-----------|-----------|----------|
| 16GB | ⚠️ Heavy swap | ⚠️ Heavy swap | ❌ |
| **24GB** | **✅** | **✅** | ❌ |
| **32GB** | **✅ Fast** | **✅ Recommended** | ⚠️ Tight |
| **36GB** | ✅ | ✅ | **✅** |
| 48GB+ | ✅ | ✅ | ✅ Headroom |

## For Other Macs

| Your Mac | Recommended |
|----------|-------------|
| 16GB | `batiai/gemma4-e4b:q4` |
| 24GB | `batiai/nemotron3-nano-omni-text:iq3` (this) |
| 32GB | `batiai/nemotron3-nano-omni-text:iq4` (this, recommended) |
| 48GB | `batiai/nemotron3-nano-omni-text:q5` |
| 128GB | `batiai/minimax-m2.7:iq3` (229B Dense frontier) |
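The table above can be expressed as a small lookup, for example in a setup script that picks a tag automatically. This sketch treats each row as a minimum RAM threshold, which is an assumption: the table lists exact sizes and says nothing about values in between. `recommend_model` is a hypothetical helper name.

```python
from typing import Optional

# (minimum RAM in GB, recommended tag), highest tier first.
# Rows are copied from the "For Other Macs" table; treating them as
# lower bounds is an assumption of this sketch.
_TIERS = [
    (128, "batiai/minimax-m2.7:iq3"),
    (48, "batiai/nemotron3-nano-omni-text:q5"),
    (32, "batiai/nemotron3-nano-omni-text:iq4"),
    (24, "batiai/nemotron3-nano-omni-text:iq3"),
    (16, "batiai/gemma4-e4b:q4"),
]


def recommend_model(ram_gb: float) -> Optional[str]:
    """Return the tag for the highest tier whose threshold fits ram_gb, or None below 16GB."""
    for min_ram_gb, tag in _TIERS:
        if ram_gb >= min_ram_gb:
            return tag
    return None


print(recommend_model(32))  # batiai/nemotron3-nano-omni-text:iq4
```

With these thresholds a 36GB Mac maps to `iq4`, even though the RAM Requirements table marks `q5` as workable at 36GB; adjust the tiers if that trade-off suits your setup better.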

## Why BatiAI?

- Custom text-only extraction from Omni multimodal source
- imatrix calibrated (200 chunks wikitext-2)
- Verified on real Mac hardware
- BatiAI signed (`general.author=BatiAI`)
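The `general.author` key mentioned above lives in the GGUF metadata header, so it can be checked without loading the model. The following is a minimal reader sketch using only the standard library. It assumes the GGUF version 2 or 3 layout (little-endian, 64-bit counts and string lengths), and `read_gguf_metadata_key` is a hypothetical helper, not part of any official tooling.

```python
import struct
from typing import Any, BinaryIO

# GGUF scalar value type ids mapped to little-endian struct formats.
_SCALARS = {0: "<B", 1: "<b", 2: "<H", 3: "<h", 4: "<I", 5: "<i",
            6: "<f", 7: "<?", 10: "<Q", 11: "<q", 12: "<d"}
_STRING, _ARRAY = 8, 9


def _read(f: BinaryIO, fmt: str) -> Any:
    """Read one fixed-size value described by a struct format string."""
    size = struct.calcsize(fmt)
    data = f.read(size)
    if len(data) != size:
        raise ValueError("unexpected end of file")
    return struct.unpack(fmt, data)[0]


def _read_string(f: BinaryIO) -> str:
    """Read a length-prefixed UTF-8 string (uint64 length, then bytes)."""
    length = _read(f, "<Q")
    return f.read(length).decode("utf-8")


def _read_value(f: BinaryIO, value_type: int) -> Any:
    """Read one metadata value, recursing into arrays."""
    if value_type in _SCALARS:
        return _read(f, _SCALARS[value_type])
    if value_type == _STRING:
        return _read_string(f)
    if value_type == _ARRAY:
        item_type = _read(f, "<I")
        count = _read(f, "<Q")
        return [_read_value(f, item_type) for _ in range(count)]
    raise ValueError(f"unknown GGUF value type {value_type}")


def read_gguf_metadata_key(f: BinaryIO, wanted: str) -> Any:
    """Scan the metadata key/value section and return the value for `wanted`, or None."""
    if f.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    _version = _read(f, "<I")       # assumed to be 2 or 3
    _tensor_count = _read(f, "<Q")  # not needed for metadata lookup
    kv_count = _read(f, "<Q")
    for _ in range(kv_count):
        key = _read_string(f)
        value = _read_value(f, _read(f, "<I"))
        if key == wanted:
            return value
    return None


# Usage, with "model.gguf" standing in for the actual file path:
# with open("model.gguf", "rb") as f:
#     print(read_gguf_metadata_key(f, "general.author"))
```

The reader stops as soon as the requested key is found, so it never touches the tensor data that makes up most of the file.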

## Built for BatiFlow

Free, on-device AI automation for Mac. 5MB app, 100% local, unlimited.

https://flow.bati.ai
