Donut Pdf Ocr
Developed by shubh1608
An OCR model trained on image-folder datasets to recognize text in PDF documents
Downloads 67
Release Time: 4/17/2023
Model Overview
This is an Optical Character Recognition (OCR) model designed to extract text from images of PDF document pages. It is a deep learning model that takes a page image as input and produces the recognized text as output.
Model Features
High-precision OCR
Reaches a loss of 0.0443 on the evaluation set, which suggests accurate recognition on documents similar to the training data.
End-to-End Training
The model adopts an end-to-end training approach, directly outputting text from images.
PDF Document Optimization
Trained specifically on images of PDF document pages.
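The end-to-end behaviour listed above (page image in, text out) can be sketched as follows. This is a minimal sketch, not the author's published usage: it assumes the checkpoint is a Donut model that loads with the Hugging Face `transformers` classes `DonutProcessor` and `VisionEncoderDecoderModel`, which this page does not confirm. `MODEL_ID` is a placeholder, `TASK_PROMPT` is a guess that depends on how the model was fine-tuned, and `strip_special_tokens` and `recognize_page` are hypothetical helper names.

```python
import re

# Placeholder: replace with the real checkpoint path or hub id (not stated on this page).
MODEL_ID = "path-or-hub-id-of-checkpoint"
# Assumption: decoder start prompt. Fine-tuned Donut models often use a custom task token.
TASK_PROMPT = "<s>"


def strip_special_tokens(sequence: str, eos_token: str, pad_token: str) -> str:
    """Remove EOS/PAD markers and the leading task token from a decoded sequence."""
    text = sequence.replace(eos_token, "").replace(pad_token, "")
    # Drop only the first <...> tag, which is the task start token.
    return re.sub(r"<.*?>", "", text, count=1).strip()


def recognize_page(image_path: str) -> str:
    """Run one page image through the model and return the decoded text."""
    # Imported lazily so the helper above works without these packages installed.
    import torch
    from PIL import Image
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(MODEL_ID)
    model = VisionEncoderDecoderModel.from_pretrained(MODEL_ID)
    model.eval()

    # The image encoder consumes pixels directly; there is no separate
    # text-detection or layout stage before decoding.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(image, return_tensors="pt").pixel_values
    decoder_input_ids = processor.tokenizer(
        TASK_PROMPT, add_special_tokens=False, return_tensors="pt"
    ).input_ids

    with torch.no_grad():
        output_ids = model.generate(
            pixel_values,
            decoder_input_ids=decoder_input_ids,
            max_length=model.decoder.config.max_position_embeddings,
            pad_token_id=processor.tokenizer.pad_token_id,
            eos_token_id=processor.tokenizer.eos_token_id,
        )

    sequence = processor.batch_decode(output_ids)[0]
    return strip_special_tokens(
        sequence, processor.tokenizer.eos_token, processor.tokenizer.pad_token
    )
```

Calling `recognize_page("page_1.png")` would return the text for a single page image, where the file name is only an example. If the checkpoint expects a different task prompt, the decoded output will likely be empty or malformed, so that value should be checked against the model's own configuration.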
Model Capabilities
PDF Document Image Text Recognition
Multi-format Text Output
Document Structure Analysis
Use Cases
Document Digitization
PDF Document Conversion
Convert scanned PDF documents into editable text formats.
Highly accurate text conversion.
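As an illustration of the conversion use case, the sketch below joins per-page recognition results into one editable text document. It assumes the PDF pages have already been rendered to image files (for example with a tool such as pdf2image or PyMuPDF) and that some function mapping an image path to text is available. `pages_to_text`, `PAGE_BREAK`, and the `recognize` parameter are hypothetical names, not part of the model.

```python
from typing import Callable, Iterable, List

# Form feed is a conventional page separator in plain-text files.
PAGE_BREAK = "\f"


def pages_to_text(page_images: Iterable[str], recognize: Callable[[str], str]) -> str:
    """Recognize each page image in order and join the results into one string."""
    texts: List[str] = []
    for path in page_images:
        # Trim leading/trailing whitespace so page breaks are the only separators.
        texts.append(recognize(path).strip())
    return PAGE_BREAK.join(texts)
```

Passing the recognizer in as an argument keeps the page-joining logic independent of the model, so it can be exercised with a stub before any checkpoint is downloaded.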
Office Automation
Document Information Extraction
Automatically extract key information from contracts, invoices, and other documents.
Improved data processing efficiency.
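For the information extraction use case, one simple post-processing step is to search the recognized text for labelled fields. The sketch below is an assumption about how this could be done, not a feature of the model: the field names and regular expressions in `FIELD_PATTERNS` are hypothetical and would need to be adapted to the actual layout of the contracts or invoices being processed.

```python
import re
from typing import Dict, Optional

# Hypothetical patterns for a simple English-language invoice layout.
FIELD_PATTERNS = {
    "invoice_number": re.compile(
        r"\bInvoice\s*(?:No\.?|Number|#)\s*:?\s*([A-Z0-9-]+)", re.IGNORECASE
    ),
    "date": re.compile(r"\bDate\s*:?\s*(\d{1,2}/\d{1,2}/\d{4})", re.IGNORECASE),
    "total": re.compile(r"\bTotal\s*:?\s*\$?\s*([\d,]+\.\d{2})", re.IGNORECASE),
}


def extract_fields(text: str) -> Dict[str, Optional[str]]:
    """Return the first match for each field, or None when a field is absent."""
    result: Dict[str, Optional[str]] = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1) if match else None
    return result
```

Fields that are not found come back as `None` rather than raising an error, which lets downstream code flag incomplete documents for manual review.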