đ Model Card for Model ID
This model is designed for Arabic Optical Character Recognition (OCR), aiming to accurately recognize Arabic text in images, which is of great value for processing Arabic - related visual materials.
đ Quick Start
Use the code below to get started with the model.
[More Information Needed]
⨠Features
- This is a Vision - Language Model for OCR, specifically tailored for Arabic text recognition.
- It is fine - tuned from the Qwen2 - VL - 2B - Instruct model.
đ Documentation
Model Details
Model Description
This is the model card of a đ¤ transformers model that has been pushed on the Hub. This model card has been automatically generated.
- Developed by: Ahmed Wasfy, Omer Nacar, Abdelakreem Elkhateb, Mahmoud Reda, Omar Elshehy, Adel Ammar, Wadii Boulila
- Model type: Vision - Language Model for OCR
- Language(s) (NLP): Arabic
- Finetuned from model: Qwen2 - VL - 2B - Instruct
Model Sources
Uses
Direct Use
This model can be directly used for recognizing Arabic text in images.
Out - of - Scope Use
This model is specifically designed for Arabic text and might not perform well on other languages.
Bias, Risks, and Limitations
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
Training Details
Training Data
Trained on specialized synthetic datasets.
Evaluation
The evaluation protocols and results are yet to be fully provided.
Model Examination
Relevant interpretability work for the model is yet to be provided.
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
Citation
BibTeX:
@misc{QariOCR2025,
title={QARI - OCR: High - Fidelity Arabic Text Recognition through Multimodal Large Language Model Adaptation},
author={Ahmed Wasfy, Omer Nacar, Abdelakreem Elkhateb, Mahmoud Reda, Omar Elshehy, Adel Ammar, Wadii Boulila},
year={2025},
archivePrefix={arXiv},
url={https://arxiv.org/abs/2506.02295},
note={Accessed: 2025 - 03 - 03}
}
Property |
Details |
Model Type |
Vision - Language Model for OCR |
Language(s) (NLP) |
Arabic |
Finetuned from model |
Qwen2 - VL - 2B - Instruct |
Training Data |
Trained on specialized synthetic datasets |
â ī¸ Important Note
This model is specifically designed for Arabic text and might not perform well on other languages.
đĄ Usage Tip
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.