Olmocr 7B 0725 FP8
O

Olmocr 7B 0725 FP8

Developed by allenai
olmOCR-7B-0725-FP8 is a document OCR model based on the Qwen2.5-VL-7B-Instruct model. It is fine-tuned using the olmOCR-mix-0225 dataset and then quantized to the FP8 version.
Downloads 881
Release Time : 7/22/2025

Model Overview

This model focuses on text recognition in document images. It can process document images containing text and extract the text content.

Model Features

FP8 Quantization
Use the llmcompressor tool to quantize the original model to the FP8 version, improving inference efficiency
Document OCR Optimization
Specifically optimized for document images, capable of accurately recognizing text content in documents
Large-scale Processing Capability
Supports efficient inference through sglang, suitable for large-scale application scenarios involving millions of documents

Model Capabilities

Document Image Text Recognition
Multi-language Text Extraction
Large-scale Document Processing

Use Cases

Document Digitization
Historical Document Digitization
Convert paper historical documents into searchable digital text
Enterprise Document Processing
Automatically process a large number of enterprise contracts, reports and other documents
Educational Research
Academic Paper Analysis
Extract text content from scanned academic papers for analysis
AIbase
Empowering the Future, Your AI Solution Knowledge Base
Š 2025AIbase