O

Olmocr 7B Thai V1

Developed by Adun
olmOCR is an optical character recognition model fine-tuned based on Qwen2-VL-7B-Instruct. It focuses on converting image content such as PDFs into text and improves the recognition accuracy in specific scenarios through fine-tuning.
Downloads 1,730
Release Time : 4/19/2025

Model Overview

olmOCR is an optical character recognition (OCR) model that can convert the content in images such as PDF files into text (TEXT). It further enhances the recognition accuracy and performance in specific scenarios through fine-tuning.

Model Features

Highly customizable
Through fine-tuning, the model can be customized according to different business needs and scenarios.
Open-source sharing
It provides model weights, fine-tuning datasets, and inference code, facilitating developers for secondary development and research.
A large amount of fine-tuning data
Based on the Vision Language Model, 250K fine-tuning has been carried out, enabling the model to have better generalization ability.
Multi-interface support
It supports two usage methods, API and CLI, and the model can be called through the command line or API (such as vLLM, SGlang).

Model Capabilities

Image to text conversion
PDF content extraction
OCR optimization for specific scenarios

Use Cases

Document digitization
PDF to text
Convert scanned PDF documents into editable text content.
Improve document processing efficiency and searchability.
Business automation
Invoice recognition
Automatically recognize and extract key information from invoices.
Reduce manual input errors and improve processing speed.
Featured Recommended AI Models
AIbase
Empowering the Future, Your AI Solution Knowledge Base
Š 2025AIbase