OCR Donut CORD

Developed by jinhybr
Donut is an OCR-free document understanding model that combines a Swin Transformer visual encoder with a BART text decoder. This version is fine-tuned on the CORD receipt dataset.
Downloads 1,130
Release Time: 11/4/2022

Model Overview

An OCR-free document understanding Transformer model that extracts and interprets text content directly from images. It is particularly suited to receipt parsing tasks.

Model Features

OCR-free processing
Eliminates the traditional OCR preprocessing step and reads text content directly from images
End-to-end training
Joint training of the visual encoder and the text decoder optimizes overall performance
Document understanding capability
Not only recognizes text but also understands document structure and semantic relationships
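
The OCR-free, end-to-end flow described in the features above can be sketched with the Hugging Face transformers library. This is a minimal sketch under stated assumptions, not the developer's published code: the repository ID `jinhybr/OCR-Donut-CORD`, the task prompt `<s_cord-v2>`, and the helper names `strip_task_prompt` and `parse_receipt` do not appear in this card and should be checked against the model's own configuration.

```python
import re

# Assumptions (not stated in this card): the Hugging Face repository ID and
# the task prompt token commonly used by CORD fine-tuned Donut checkpoints.
MODEL_ID = "jinhybr/OCR-Donut-CORD"
TASK_PROMPT = "<s_cord-v2>"


def strip_task_prompt(sequence: str, eos_token: str = "</s>", pad_token: str = "<pad>") -> str:
    """Remove EOS/PAD tokens and the leading task prompt tag from decoded text."""
    sequence = sequence.replace(eos_token, "").replace(pad_token, "")
    # Drop only the first tag, which is the task start token.
    return re.sub(r"<.*?>", "", sequence, count=1).strip()


def parse_receipt(image_path: str) -> dict:
    """Feed a receipt image to the encoder-decoder and return the parsed fields."""
    # Heavy imports are kept local so strip_task_prompt works without them.
    import torch
    from PIL import Image
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(MODEL_ID)
    model = VisionEncoderDecoderModel.from_pretrained(MODEL_ID)
    model.eval()

    # No OCR step: the raw pixels go straight to the visual encoder.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(image, return_tensors="pt").pixel_values

    # The text decoder is primed with the task prompt and generates tagged output.
    decoder_input_ids = processor.tokenizer(
        TASK_PROMPT, add_special_tokens=False, return_tensors="pt"
    ).input_ids

    with torch.no_grad():
        outputs = model.generate(
            pixel_values,
            decoder_input_ids=decoder_input_ids,
            max_length=model.decoder.config.max_position_embeddings,
            pad_token_id=processor.tokenizer.pad_token_id,
            eos_token_id=processor.tokenizer.eos_token_id,
            use_cache=True,
            bad_words_ids=[[processor.tokenizer.unk_token_id]],
            return_dict_in_generate=True,
        )

    sequence = processor.batch_decode(outputs.sequences)[0]
    sequence = strip_task_prompt(
        sequence, processor.tokenizer.eos_token, processor.tokenizer.pad_token
    )
    # token2json converts the tagged sequence into a nested dictionary.
    return processor.token2json(sequence)
```

Calling `parse_receipt("receipt.png")` (a hypothetical file name) would download the checkpoint on first use, so it needs network access plus the `torch`, `Pillow`, and `transformers` packages.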

Model Capabilities

Image-to-text
Document parsing
Receipt information extraction
Vision-language understanding

Use Cases

Business document processing
Receipt information extraction
Automatically extracts key information such as merchant, amount, and date from receipt images
Excellent performance on the CORD dataset
Invoice processing
Automatically parses structured data from invoices
Document digitization
Paper document conversion
Converts scanned paper documents into structured digital format
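
To illustrate how the use cases above arrive at structured data, the sketch below converts a Donut-style tagged sequence into nested Python dictionaries. It is a simplified illustration, not the library's implementation (in practice `DonutProcessor.token2json` in transformers performs this step). The function names `tags_to_dict` and `_parse`, and the hand-written `SAMPLE` string with its field names, are assumptions made for this example and are not real model output.

```python
import re

# Splits a sequence into opening tags, closing tags, separators, and text.
TOKEN = re.compile(r"(</?s_[^>]+>|<sep/>)")


def _parse(tokens, pos):
    """Parse sibling fields until a closing tag or the end; return (value, next_pos)."""
    groups = [{}]  # one dict per <sep/>-separated group of fields
    text = []      # plain text found at this level (leaf values)
    while pos < len(tokens):
        tok = tokens[pos]
        if tok.startswith("</s_"):
            break  # the caller consumes the closing tag
        if tok == "<sep/>":
            groups.append({})
            pos += 1
        elif tok.startswith("<s_"):
            key = tok[3:-1]
            value, pos = _parse(tokens, pos + 1)
            # Consume the matching closing tag when it is present.
            if pos < len(tokens) and tokens[pos] == f"</s_{key}>":
                pos += 1
            groups[-1][key] = value
        else:
            text.append(tok.strip())
            pos += 1
    if not any(groups):
        return " ".join(text), pos  # leaf: plain text only
    non_empty = [g for g in groups if g]
    return (non_empty[0] if len(non_empty) == 1 else non_empty), pos


def tags_to_dict(sequence: str):
    """Convert a tagged sequence into nested dicts, lists, and strings."""
    tokens = [t for t in TOKEN.split(sequence) if t.strip()]
    value, _ = _parse(tokens, 0)
    return value


# Hand-written sample in the tagged style; field names are illustrative.
SAMPLE = (
    "<s_menu><s_nm>Latte</s_nm><s_price>4.50</s_price><sep/>"
    "<s_nm>Bagel</s_nm><s_price>3.00</s_price></s_menu>"
    "<s_total><s_total_price>7.50</s_total_price></s_total>"
)

if __name__ == "__main__":
    parsed = tags_to_dict(SAMPLE)
    for item in parsed["menu"]:
        print(item["nm"], item["price"])
    print("total:", parsed["total"]["total_price"])
```

Repeated groups separated by `<sep/>` become a list, so line items on a receipt can be iterated directly, while single groups stay as plain dictionaries.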