tb-etl/mineru-q4km

tb-etl/mineru-q4km:latest

153 Downloads · Updated 4 weeks ago

MinerU 2.5 Pro (1.2B) - Q4_K_M: an advanced document parsing vision-language model (VLM)

ollama run tb-etl/mineru-q4km

curl http://localhost:11434/api/chat \
  -d '{
    "model": "tb-etl/mineru-q4km",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

from ollama import chat

response = chat(
    model='tb-etl/mineru-q4km',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)

import ollama from 'ollama'

const response = await ollama.chat({
  model: 'tb-etl/mineru-q4km',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)

Details

c347c8d9bd13 · 398MB

model · arch qwen2vl · parameters 494M · quantization Q4_K_M · 398MB

Readme

MinerU 2.5 Pro (1.2B) - Q4_K_M

An advanced document parsing vision-language model (VLM) specialized in converting PDFs, charts, formulas, and Office documents into structured formats like Markdown and JSON.

Key Features

Architecture: A 1.2B parameter Vision-Language Model optimized for spatial and structural document analysis.
Top Performance: Scores highly on document parsing benchmarks, including state-of-the-art accuracy in dense formula recognition, table parsing, and text extraction.

Advanced Document Understanding:

Chart and image parsing
Complex table merging across split pages
Dense formula translation to clean LaTeX
Image recognition within tables
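
The quick-start snippets on this page send only text, while the features above all operate on page images. The following is a minimal sketch of attaching one image to a chat request, assuming a local Ollama server at the default address used in the curl example. The helper names `build_chat_payload` and `parse_page` are hypothetical, and the prompt text is a placeholder: this page does not document the prompt format the model expects.

```python
import base64
import json
import urllib.request

# Default local endpoint, as in the curl example on this page.
OLLAMA_URL = "http://localhost:11434/api/chat"
MODEL = "tb-etl/mineru-q4km"


def build_chat_payload(image_bytes: bytes, prompt: str) -> dict:
    """Build a non-streaming /api/chat request body with one attached image."""
    return {
        "model": MODEL,
        "stream": False,  # ask for a single JSON reply instead of a stream
        "messages": [
            {
                "role": "user",
                "content": prompt,
                # The Ollama REST API takes images as base64-encoded strings.
                "images": [base64.b64encode(image_bytes).decode("ascii")],
            }
        ],
    }


def parse_page(image_path: str,
               prompt: str = "Convert this page to Markdown.") -> str:
    """Send one page image to the model and return the reply text.

    The default prompt is an assumption, not a documented instruction.
    """
    with open(image_path, "rb") as f:
        payload = build_chat_payload(f.read(), prompt)
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["message"]["content"]
```

Calling `parse_page("page-1.png")` requires a running server with the model pulled. PDFs would need to be rendered to page images first, which this sketch leaves out.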

Quantization Details

Type: Q4_K_M
Balanced Profile: Uses Q6_K for half of the attention and feed-forward tensors and Q4_K for the rest, aiming for a good balance of speed, low memory footprint, and near-native accuracy.
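
As a rough sanity check on the quantization profile, the file size and parameter count listed under Details give an average number of bits stored per weight. This is a sketch under two assumptions: that "398MB" means 398 × 10^6 bytes, and that the whole file is weight data (metadata and any tensors kept at higher precision are not separated out). The function name `bits_per_weight` is illustrative.

```python
def bits_per_weight(file_size_bytes: float, n_params: float) -> float:
    """Average bits stored per parameter for a model file."""
    return file_size_bytes * 8 / n_params


# Figures from the Details section of this page.
size_bytes = 398e6   # "398MB", assumed to be decimal megabytes
n_params = 494e6     # "parameters 494M"

print(round(bits_per_weight(size_bytes, n_params), 2))  # 6.45
```

For comparison, llama.cpp's nominal rates are about 4.5 bits per weight for Q4_K and 6.5625 for Q6_K. An average above the Q4_K rate would be consistent with some tensors being stored at higher precision, but this page does not break the file down by tensor.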