OCR Donut CORD
Donut is an OCR-free document understanding model that pairs a Swin Transformer visual encoder with a BART text decoder. This version is fine-tuned on the CORD receipt dataset.
Downloads 1,130
Release Time: 11/4/2022
Model Overview
An OCR-free, Transformer-based document understanding model that extracts and interprets text content directly from images. It is particularly suitable for receipt parsing tasks.
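A minimal inference sketch for a model of this kind, assuming the Hugging Face transformers library with PyTorch and Pillow installed. The checkpoint id and the task prompt below are assumptions (neither is stated on this page), and the helper names strip_special_tokens and parse_receipt are illustrative only; substitute the values for the checkpoint you actually use.

```python
import re

# Assumptions, not stated on this page: checkpoint id and task prompt.
MODEL_ID = "naver-clova-ix/donut-base-finetuned-cord-v2"
TASK_PROMPT = "<s_cord-v2>"


def strip_special_tokens(sequence: str, eos_token: str, pad_token: str) -> str:
    """Remove EOS/PAD tokens and the leading task-prompt tag from decoder output."""
    sequence = sequence.replace(eos_token, "").replace(pad_token, "")
    # Drop only the first tag, which is the task prompt.
    return re.sub(r"<.*?>", "", sequence, count=1).strip()


def parse_receipt(image_path: str) -> dict:
    """Run the model on one receipt image and return the parsed fields."""
    # Imported lazily so the helper above works without the heavy dependencies.
    import torch
    from PIL import Image
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(MODEL_ID)
    model = VisionEncoderDecoderModel.from_pretrained(MODEL_ID)
    model.eval()

    # The image goes straight to the visual encoder; no OCR step is involved.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(image, return_tensors="pt").pixel_values

    # The decoder is primed with the task prompt and generates the field markup.
    decoder_input_ids = processor.tokenizer(
        TASK_PROMPT, add_special_tokens=False, return_tensors="pt"
    ).input_ids

    with torch.no_grad():
        output_ids = model.generate(
            pixel_values,
            decoder_input_ids=decoder_input_ids,
            max_length=model.decoder.config.max_position_embeddings,
            pad_token_id=processor.tokenizer.pad_token_id,
            eos_token_id=processor.tokenizer.eos_token_id,
            use_cache=True,
        )

    sequence = processor.batch_decode(output_ids)[0]
    sequence = strip_special_tokens(
        sequence, processor.tokenizer.eos_token, processor.tokenizer.pad_token
    )
    return processor.token2json(sequence)
```

Calling parse_receipt("receipt.png") (a hypothetical file name) downloads the checkpoint on first use and returns a nested dictionary of receipt fields.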
Model Features
OCR-free processing
Eliminates the traditional OCR preprocessing step and reads text content directly from images
End-to-end training
Joint training of the visual encoder and text decoder optimizes overall performance
Document understanding capability
Not only recognizes text but also understands document structure and semantic relationships
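To make the structure-understanding point concrete: Donut-style decoders emit a flat string of opening and closing field tags, which is then converted into nested data. The following is a simplified sketch of that conversion, assuming tags of the form <s_name>...</s_name>. The function name tags_to_dict is hypothetical, and this is not the library's own converter, which handles more cases (for example, separators between repeated groups and same-named nested tags).

```python
import re

# Matches one field: an opening tag, its content, and the matching closing tag.
TAG = re.compile(r"<s_(\w+)>(.*?)</s_\1>", re.DOTALL)


def tags_to_dict(sequence: str):
    """Convert '<s_key>value</s_key>' markup into nested dicts.

    Leaf text is returned as a stripped string; keys that repeat at the
    same level are collected into a list.
    """
    matches = list(TAG.finditer(sequence))
    if not matches:
        return sequence.strip()  # no tags left, so this is a leaf value
    result = {}
    for match in matches:
        key = match.group(1)
        value = tags_to_dict(match.group(2))  # recurse into the field body
        if key in result:
            # Promote the existing entry to a list when a key repeats.
            if not isinstance(result[key], list):
                result[key] = [result[key]]
            result[key].append(value)
        else:
            result[key] = value
    return result
```

Field names such as menu, nm, and total_price in any input you try are whatever the fine-tuned model emits; the sketch does not depend on a particular set.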
Model Capabilities
Image-to-text
Document parsing
Receipt information extraction
Vision-language understanding
Use Cases
Business document processing
Receipt information extraction
Automatically extracts key information such as merchant, amount, and date from receipt images
Excellent performance on the CORD dataset
Invoice processing
Automatically parses structured data from invoices
Document digitization
Paper document conversion
Converts scanned paper documents into a structured digital format
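For the business-document use cases above, the nested output of a parser is often flattened before it is stored in a spreadsheet or database. A small sketch of that step, assuming the parsed result is a nested structure of dicts, lists, and strings; the function name flatten and the dotted-key layout are illustrative choices, not part of the model.

```python
import json


def flatten(node, prefix=""):
    """Flatten nested dicts/lists into a single dict with dotted keys."""
    flat = {}
    if isinstance(node, dict):
        for key, value in node.items():
            flat.update(flatten(value, f"{prefix}{key}."))
    elif isinstance(node, list):
        # List positions become numeric path segments, e.g. 'menu.0.nm'.
        for index, value in enumerate(node):
            flat.update(flatten(value, f"{prefix}{index}."))
    else:
        flat[prefix.rstrip(".")] = node
    return flat


def to_json_record(parsed) -> str:
    """Serialize a parsed document as one flat JSON object."""
    return json.dumps(flatten(parsed), ensure_ascii=False, indent=2)
```

Values are kept as the strings the model produced; converting amounts or dates to typed values depends on the locale of the documents and is left out here.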