Arabic Base Nougat

Developed by MohamedRashad
An end-to-end structured optical character recognition (OCR) system designed specifically for Arabic, fine-tuned from the facebook/nougat-base model
Downloads 130
Release Time: 10/13/2024

Model Overview

This model is an end-to-end structured OCR system for Arabic books. It converts images of Arabic book pages into structured text and is particularly suited to workflows that need Markdown output.

Model Features

Arabic OCR Optimization
Specially optimized for Arabic text, capable of accurately recognizing complex layouts and characters in Arabic book pages
Structured Output
Supports generating structured text output in Markdown format, preserving the original document's formatting information
End-to-End Processing
Goes directly from image input to text output, with no separate intermediate stages
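
The image-to-Markdown flow described above can be sketched with the Hugging Face transformers Nougat classes. This is a minimal sketch, not the author's published recipe: the Hub id "MohamedRashad/arabic-base-nougat" is an assumption inferred from the developer and model name on this page, the file name in the usage note is hypothetical, and the 1024-token generation cap is an illustrative value. It assumes torch, transformers, and Pillow are installed.

```python
# Assumed Hub id, inferred from the developer and model name; verify before use.
MODEL_ID = "MohamedRashad/arabic-base-nougat"


def load_model(model_id: str = MODEL_ID):
    """Load the processor and model once, then move the model to GPU if available."""
    # Heavy imports are kept inside the function so the module imports cheaply.
    import torch
    from transformers import NougatProcessor, VisionEncoderDecoderModel

    processor = NougatProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device).eval()
    return processor, model, device


def ocr_page(image_path: str, processor, model, device, max_new_tokens: int = 1024) -> str:
    """Convert one book-page image to Markdown text in a single generate() call."""
    import torch
    from PIL import Image

    # The processor handles resizing and normalization of the page image.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values.to(device)

    with torch.no_grad():
        output_ids = model.generate(
            pixel_values,
            min_length=1,
            max_new_tokens=max_new_tokens,  # illustrative cap; raise it if pages are cut off
            bad_words_ids=[[processor.tokenizer.unk_token_id]],
        )
    return processor.batch_decode(output_ids, skip_special_tokens=True)[0]
```

A call would look like `processor, model, device = load_model()` followed by `ocr_page("page_001.png", processor, model, device)`. Loading once and reusing the objects avoids re-reading the weights for every page.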

Model Capabilities

Arabic Text Recognition
English Text Recognition
Book Page Image Processing
Markdown Format Generation

Use Cases

Literature Digitization
Digitization of Ancient Arabic Texts
Converts printed ancient Arabic texts into editable digital text, producing structured output that preserves the original layout and formatting
Education
Textbook Content Extraction
Extracts teaching content from scanned Arabic textbooks, producing editable text that makes it easier to create e-textbooks
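
For the digitization use cases above, the per-page outputs usually need to be stitched into one document. The helper below is a hypothetical sketch of that step: `pages_to_markdown` and its `recognize` parameter are names introduced here for illustration, where `recognize` stands for any function that maps an image path to Markdown text. It assumes page files are named so that lexicographic order matches page order (for example, zero-padded numbers).

```python
from pathlib import Path
from typing import Callable, Iterable


def pages_to_markdown(page_paths: Iterable[str], recognize: Callable[[str], str]) -> str:
    """Run `recognize` on each page image and join the results into one Markdown string."""
    sections = []
    # Sorting assumes zero-padded names such as page_001.png, page_002.png.
    for number, path in enumerate(sorted(page_paths), start=1):
        text = recognize(path).strip()
        # An HTML comment marks the page boundary without showing up in rendered Markdown.
        sections.append(f"<!-- page {number}: {Path(path).name} -->\n\n{text}")
    return "\n\n".join(sections) + "\n"
```

Keeping the recognizer as a parameter means the same helper works with any OCR backend, and it can be exercised with a stub function before a real model is wired in.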