yasserrmd/Nanonets-OCR2-3B

yasserrmd/Nanonets-OCR2-3B:latest

1,827 Downloads Updated 3 months ago

vision

ollama run yasserrmd/Nanonets-OCR2-3B

curl http://localhost:11434/api/chat \
  -d '{
    "model": "yasserrmd/Nanonets-OCR2-3B",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

from ollama import chat

response = chat(
    model='yasserrmd/Nanonets-OCR2-3B',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)

import ollama from 'ollama'

const response = await ollama.chat({
  model: 'yasserrmd/Nanonets-OCR2-3B',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)

Details

5eea018fd62e · 4.1GB

model · arch qwen2vl · parameters 3.09B · quantization Q8_0 · 3.3GB

projector · arch clip · parameters 669M · quantization Q8_0 · 848MB

template · 487B
{{- if .System -}} <|im_start|>system {{ .System }}<|im_end|> {{- end -}} {{- range $i, $_ := .Messa

params · 22B
{ "temperature": 0.0001 }
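
The listed model size can be sanity-checked from the parameter count. The sketch below assumes the standard GGUF Q8_0 layout (blocks of 32 int8 weights plus one 2-byte fp16 scale, so 34 bytes per 32 weights) and assumes every weight is stored that way. Real files usually keep a few tensors at higher precision and add metadata, so treat the result as an approximation. The function name is illustrative only.

```python
Q8_0_BLOCK_WEIGHTS = 32   # weights per Q8_0 block
Q8_0_BLOCK_BYTES = 34     # 32 int8 values + one 2-byte fp16 scale


def q8_0_size_bytes(n_params: int) -> int:
    """Approximate file size if every weight were stored as Q8_0."""
    blocks = -(-n_params // Q8_0_BLOCK_WEIGHTS)  # ceiling division
    return blocks * Q8_0_BLOCK_BYTES


# 3.09B parameters, as listed for the language model above.
estimate = q8_0_size_bytes(3_090_000_000)
print(f"{estimate / 1e9:.2f} GB")  # → 3.28 GB
```

The estimate lands close to the 3.3GB shown for the model component.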

Readme

## yasserrmd/Nanonets-OCR2-3B (8-bit)

* **Base model:** [nanonets/Nanonets-OCR2-3B](https://huggingface.co/nanonets/Nanonets-OCR2-3B)
* **Type:** Multimodal OCR & document understanding (images → structured text, tables, LaTeX, captions).
* **Precision:** 8-bit quantized for efficient inference.
* **Params:** ~3B
* **Format:** GGUF / Ollama compatible

---

### ⚙️ Usage (Ollama)

```bash
ollama pull yasserrmd/Nanonets-OCR2-3B:q8_0
ollama run yasserrmd/Nanonets-OCR2-3B:q8_0
```

**Example prompt:**

```
Extract all text, tables, and equations from the uploaded document image.
Return tables in HTML and equations in LaTeX.
```
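
If the model returns tables as HTML, as the prompt above requests, they can be turned into rows of cell text for further processing. The following is a minimal sketch using only Python's standard `html.parser`. The class and function names are illustrative, the sample string is invented for this example and is not real model output, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> as a list of rows, each a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []    # one list of rows per <table> seen so far
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def extract_tables(model_output: str) -> list:
    """Return all HTML tables found in the model's text output."""
    parser = TableExtractor()
    parser.feed(model_output)
    parser.close()
    return parser.tables


# Hypothetical output, invented for illustration.
sample = (
    "Invoice\n"
    "<table><tr><th>Item</th><th>Qty</th></tr>"
    "<tr><td>Pen</td><td>2</td></tr></table>"
)
print(extract_tables(sample))  # → [[['Item', 'Qty'], ['Pen', '2']]]
```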

You can also use it via API:

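```python
import requests

# "stream": False returns one JSON object instead of a stream of chunks.
resp = requests.post("http://localhost:11434/api/generate",
                     json={"model": "yasserrmd/Nanonets-OCR2-3B:q8_0",
                           "prompt": "<your prompt here>", "stream": False})
print(resp.json()["response"])
```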
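
The request above carries only text. For OCR the document image has to travel with the prompt. Ollama's `/api/generate` endpoint accepts an `images` list of base64-encoded strings for multimodal models. The sketch below uses only the standard library and assumes a local server on the default port. The helper names `build_ocr_payload` and `run_ocr` are illustrative and are not part of any Ollama client.

```python
import base64
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local endpoint
MODEL = "yasserrmd/Nanonets-OCR2-3B:q8_0"


def build_ocr_payload(image_bytes: bytes, prompt: str, model: str = MODEL) -> dict:
    """Build a non-streaming /api/generate body with one attached image."""
    return {
        "model": model,
        "prompt": prompt,
        # Images are sent as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def run_ocr(image_path: str, prompt: str, url: str = OLLAMA_URL) -> str:
    """Send one image to a running Ollama server and return the generated text."""
    with open(image_path, "rb") as f:
        payload = build_ocr_payload(f.read(), prompt)
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

With the server running, `print(run_ocr("scan.png", "Extract all text, tables, and equations."))` would print the model's output, where `scan.png` stands in for your own file.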

---

### 📘 Notes

* Original documentation, evaluation, and architecture: [Hugging Face Model Page →](https://huggingface.co/nanonets/Nanonets-OCR2-3B)
* Use high-resolution input images for better OCR accuracy.
* 8-bit (Q8_0) quantization reduces memory use and typically speeds up inference, with minimal quality loss.
* Best suited for document parsing, forms, and scanned PDFs.
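
To act on the resolution advice above, image dimensions can be checked before a file is sent to the model. This is a minimal sketch for PNG files only, reading the width and height from the IHDR chunk with the standard library. The 1000-pixel threshold is an assumption chosen for illustration, not a figure from the model documentation, and the function names are hypothetical.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MIN_SHORT_SIDE = 1000  # assumed threshold in pixels; tune for your documents


def png_dimensions(data: bytes) -> tuple[int, int]:
    """Return (width, height) read from a PNG file's IHDR chunk."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # Width and height are big-endian 32-bit integers at bytes 16..23.
    return struct.unpack(">II", data[16:24])


def is_high_resolution(data: bytes, min_short_side: int = MIN_SHORT_SIDE) -> bool:
    """True if the shorter side of the image meets the chosen threshold."""
    width, height = png_dimensions(data)
    return min(width, height) >= min_short_side
```

Only the first 24 bytes of the file are needed, so `is_high_resolution(open(path, "rb").read(24))` is enough for a quick check.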
