State-of-the-art OCR (Optical Character Recognition) vision language model based on allenai/olmOCR-2-7B-1025.
This model excels at extracting text from:
- Documents and PDFs
- Handwritten notes
- Tables and spreadsheets
- Charts and graphs
- Mathematical expressions
- Screenshots and images

- Base Model: Qwen2.5-VL-7B-Instruct (fine-tuned for OCR)
- Quantization: Q8_0 (8-bit, high quality)
- Size: 8.85 GB
- Performance: 82.4 points on olmOCR-Bench
ollama pull richardyoung/olmocr2:7b-q8
ollama run richardyoung/olmocr2:7b-q8 "Extract all text from this image." image.png
ollama run richardyoung/olmocr2:7b-q8 "Extract all text from this document, preserving formatting and structure." document.jpg
ollama run richardyoung/olmocr2:7b-q8 "Transcribe the handwritten text in this image." handwriting.jpg
ollama run richardyoung/olmocr2:7b-q8 "Extract the table data and format it as markdown." table.png
ollama run richardyoung/olmocr2:7b-q8 "Extract all mathematical equations from this image in LaTeX format." math.png
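import ollama

# Send one user message with the image attached by file path.
response = ollama.chat(
    model='richardyoung/olmocr2:7b-q8',
    messages=[{
        'role': 'user',
        'content': 'Extract all text from this image.',
        'images': ['document.png']
    }]
)

# The extracted text is returned as the assistant message content.
print(response['message']['content'])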
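To process several images, the call above can be wrapped in a small helper. This is a minimal sketch, not something shipped with the model: the names `TASK_PROMPTS`, `build_message`, and `ocr_images` are hypothetical, the prompts are copied from the CLI examples on this page, and the `chat` parameter exists only so the helper can be exercised without a running Ollama server.

```python
MODEL = 'richardyoung/olmocr2:7b-q8'

# Prompts copied from the CLI examples on this page; the keys are hypothetical labels.
TASK_PROMPTS = {
    'text': 'Extract all text from this image.',
    'handwriting': 'Transcribe the handwritten text in this image.',
    'table': 'Extract the table data and format it as markdown.',
    'math': 'Extract all mathematical equations from this image in LaTeX format.',
}


def build_message(image_path, task='text'):
    """Build one user message in the same shape as the example above."""
    return {
        'role': 'user',
        'content': TASK_PROMPTS[task],
        'images': [str(image_path)],
    }


def ocr_images(image_paths, task='text', chat=None):
    """Run the model on each image and return {path: extracted text}."""
    if chat is None:
        import ollama  # imported lazily so the helper can be tested without it
        chat = ollama.chat
    results = {}
    for path in image_paths:
        response = chat(model=MODEL, messages=[build_message(path, task)])
        results[str(path)] = response['message']['content']
    return results
```

Calling `ocr_images(['table.png'], task='table')` would send the table prompt with that image and return the result keyed by path. One request per image keeps each transcription separate, which is a design choice of this sketch rather than a requirement of the model.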
Minimum:
- 10 GB RAM
- 9 GB free disk space

Recommended:
- 16 GB RAM
- Apple Silicon (M1/M2/M3/M4) or NVIDIA GPU
- Metal or CUDA support for GPU acceleration
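Before pulling the model, the disk requirement can be checked with the standard library. This is a rough sketch under stated assumptions: the helper names are hypothetical, the 9 GB threshold is the minimum listed above, and the check only covers the filesystem containing the path you pass in, so point it at wherever Ollama stores models on your machine.

```python
import shutil

MIN_FREE_DISK_GB = 9  # minimum free disk space listed on this page


def free_disk_gb(path='.'):
    """Return free space on the filesystem containing `path`, in GB (10**9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def has_enough_disk(path='.', required_gb=MIN_FREE_DISK_GB):
    """Return True if the filesystem containing `path` has at least `required_gb` free."""
    return free_disk_gb(path) >= required_gb
```

For example, `has_enough_disk('/path/to/models')` returns `False` when less than 9 GB is free there. RAM is not checked because the standard library has no portable call for it.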
--num-ctx 8192

If you need different formats or sizes:
Apache 2.0 (same as base model)
@article{olmocr2,
  title={olmOCR-2: Advancing OCR with Vision Language Models},
  author={Allen Institute for AI},
  year={2025}
}