olmOCR-7B-0725-FP8 Open-source Document OCR Model - Free Deployment for Accurate Document Content Recognition

Olmocr 7B 0725 FP8

Developed by allenai

olmOCR-7B-0725-FP8 is a document OCR model based on the Qwen2.5-VL-7B-Instruct model. It is fine-tuned using the olmOCR-mix-0225 dataset and then quantized to the FP8 version.

Image-to-Text

Transformers

EnglishOpen Source License:Apache-2.0 #Document Image OCR #FP8 Quantization #Large-scale Document Processing

Downloads 881

Release Time : 7/22/2025

Model Overview

This model focuses on text recognition in document images. It can process document images containing text and extract the text content.

Model Features

FP8 Quantization

Use the llmcompressor tool to quantize the original model to the FP8 version, improving inference efficiency

Document OCR Optimization

Specifically optimized for document images, capable of accurately recognizing text content in documents

Large-scale Processing Capability

Supports efficient inference through sglang, suitable for large-scale application scenarios involving millions of documents

Model Capabilities

Document Image Text Recognition

Multi-language Text Extraction

Large-scale Document Processing

Use Cases

Document Digitization

Historical Document Digitization

Convert paper historical documents into searchable digital text

Enterprise Document Processing

Automatically process a large number of enterprise contracts, reports and other documents

Educational Research

Academic Paper Analysis

Extract text content from scanned academic papers for analysis

Property	Details
Model Type	Quantized to FP8 Version of olmOCR-7B-0725
Training Data	allenai/olmOCR-mix-0225
Base Model	Qwen/Qwen2.5-VL-7B-Instruct
Library Name	transformers

Featured Recommended AI Models

Qwen2.5 VL 7B Abliterated Caption It I1 GGUF

Apache-2.0

Quantized version of Qwen2.5-VL-7B-Abliterated-Caption-it, supporting multilingual image description tasks.

Image-to-Text

Transformers Supports Multiple Languages

mradermacher

167

Nunchaku Flux.1 Dev Colossus

Other

The Nunchaku quantized version of the Colossus Project Flux, designed to generate high-quality images based on text prompts. This model minimizes performance loss while optimizing inference efficiency.

Image Generation English

nunchaku-tech

235

Qwen2.5 VL 7B Abliterated Caption It GGUF

Apache-2.0

This is a static quantized version based on the Qwen2.5-VL-7B model, focusing on image captioning generation tasks and supporting multiple languages.

Image-to-Text

Transformers Supports Multiple Languages

olmOCR-7B-0725-FP8 is a document OCR model based on the Qwen2.5-VL-7B-Instruct model. It is fine-tuned using the olmOCR-mix-0225 dataset and then quantized to the FP8 version.

Lucy-128k is a model developed based on Qwen3-1.7B, focusing on proxy-based web search and lightweight browsing, and can run efficiently on mobile devices.

Empowering the Future, Your AI Solution Knowledge Base

English 简体中文繁體中文にほんご

Olmocr 7B 0725 FP8

Model Introduction

Content Details

Alternatives

Model Overview

Model Features

Model Capabilities

Use Cases

🚀 olmOCR-7B-0725-FP8

🚀 Quick Start

💻 Usage Examples

Basic Usage

📄 License

Featured Recommended AI Models