Qari-OCR-0.3 Open-source Model - Free to deploy and directly recognize Arabic text in images

Qari OCR 0.3 SNAPSHOT VL 2B Instruct Merged

Developed by NAMAA-Space

A vision-language model designed specifically for Arabic optical character recognition (OCR), capable of directly recognizing Arabic text in images.

Image-to-Text

Transformers

#Arabic OCR #Multimodal large model #Image text recognition

Downloads 467

Release Time : 4/10/2025

Model Overview

This model is fine-tuned based on Qwen2-VL-2B-Instruct and is specifically used for Arabic optical character recognition tasks, providing an efficient image text recognition solution.

Model Features

Arabic-specific OCR

Optimized for Arabic character recognition, providing high-precision recognition capabilities.

Vision-Language Model

Combines visual and language understanding capabilities to directly recognize text from images.

Efficient Solution

Provides a fast and accurate text recognition solution for the Arabic processing field.

Model Capabilities

Arabic image text recognition

Multimodal text understanding

High-precision OCR

Use Cases

Document digitization

Arabic document scanning

Convert paper Arabic documents into editable electronic text

High-fidelity text conversion

Image text extraction

Arabic image text recognition

Extract text content from natural images containing Arabic text

Accurate text recognition results

🚀 Model Card for Model ID

This model is designed for Arabic Optical Character Recognition (OCR), aiming to accurately recognize Arabic text in images, which is of great value for processing Arabic - related visual materials.

🚀 Quick Start

Use the code below to get started with the model. [More Information Needed]

✨ Features

This is a Vision - Language Model for OCR, specifically tailored for Arabic text recognition.
It is fine - tuned from the Qwen2 - VL - 2B - Instruct model.

📚 Documentation

Model Details

Model Description

This is the model card of a 🤖 transformers model that has been pushed on the Hub. This model card has been automatically generated.

Developed by: Ahmed Wasfy, Omer Nacar, Abdelakreem Elkhateb, Mahmoud Reda, Omar Elshehy, Adel Ammar, Wadii Boulila
Model type: Vision - Language Model for OCR
Language(s) (NLP): Arabic
Finetuned from model: Qwen2 - VL - 2B - Instruct

Model Sources

Paper: QARI - OCR: High - Fidelity Arabic Text Recognition through Multimodal Large Language Model Adaptation

Uses

Direct Use

This model can be directly used for recognizing Arabic text in images.

Out - of - Scope Use

This model is specifically designed for Arabic text and might not perform well on other languages.

Bias, Risks, and Limitations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

Training Details

Training Data

Trained on specialized synthetic datasets.

Evaluation

The evaluation protocols and results are yet to be fully provided.

Model Examination

Relevant interpretability work for the model is yet to be provided.

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

Citation

BibTeX:

@misc{QariOCR2025,
  title={QARI - OCR: High - Fidelity Arabic Text Recognition through Multimodal Large Language Model Adaptation},
  author={Ahmed Wasfy, Omer Nacar, Abdelakreem Elkhateb, Mahmoud Reda, Omar Elshehy, Adel Ammar, Wadii Boulila},
  year={2025},
  archivePrefix={arXiv},
  url={https://arxiv.org/abs/2506.02295},
  note={Accessed: 2025 - 03 - 03}
}

Property	Details
Model Type	Vision - Language Model for OCR
Language(s) (NLP)	Arabic
Finetuned from model	Qwen2 - VL - 2B - Instruct
Training Data	Trained on specialized synthetic datasets

⚠️ Important Note

This model is specifically designed for Arabic text and might not perform well on other languages.

💡 Usage Tip

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.

Featured Recommended AI Models

Empowering the Future, Your AI Solution Knowledge Base

English 简体中文繁體中文にほんご