Donut Receipts Extract

Developed by AdamCodd
A receipt text extraction model based on the Donut architecture. It performs OCR-free document understanding by pairing a visual encoder with a text decoder.
Downloads 66
Release Time: 1/28/2024

Model Overview

This model extracts structured text from receipt images. It uses a Swin Transformer as the visual encoder and BART as the text decoder, so recognition and field extraction happen end to end in a single model.

Model Features

OCR-Free Document Understanding
Directly processes image inputs and extracts text information without traditional OCR preprocessing steps
Double-Resolution Processing
V2 processes receipt images at double the resolution, significantly improving recognition accuracy
Structured Output
Automatically generates structured data in JSON format, including key receipt fields (e.g., amount, phone number, discount)
Improved Dataset
Trained on a deduplicated and manually corrected dataset, showing significant performance improvements over V1
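
The structured output listed under Model Features can be illustrated with a small sketch. Donut-style decoders emit a tagged token sequence such as <s_total>...</s_total>, which is then converted to JSON. The parser below is a simplified stand-in written for illustration, not the model's own post-processing (Hugging Face's DonutProcessor provides a token2json method for that purpose). The <s_phone> tag name and all field values are assumptions made up for the example.

```python
import json
import re

# Matches one <s_name>...</s_name> pair; the backreference keeps tags balanced.
FIELD_RE = re.compile(r"<s_(\w+)>(.*?)</s_\1>", re.DOTALL)


def tags_to_dict(sequence: str) -> dict:
    """Convert a <s_field>value</s_field> string into a (possibly nested) dict.

    Simplification: if a field name repeats at the same level, the last
    occurrence wins. Real receipts with repeated line items need a list.
    """
    result = {}
    for name, body in FIELD_RE.findall(sequence):
        # Recurse when the body itself contains tagged fields.
        value = tags_to_dict(body) if FIELD_RE.search(body) else body.strip()
        result[name] = value
    return result


# Hypothetical decoder output; tag names other than s_total and s_date,
# and every value shown here, are invented for illustration.
sample = (
    "<s_total>23.90</s_total>"
    "<s_date>2024-01-28</s_date>"
    "<s_phone>555-0100</s_phone>"
)
parsed = tags_to_dict(sample)
print(json.dumps(parsed, indent=2))
```

Keeping the parser separate from the model call makes it easy to validate or normalize individual fields (for example, converting the total to a decimal) before the data is stored.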

Model Capabilities

Receipt Image Recognition
Text Information Extraction
Structured Data Generation
Multi-Field Joint Parsing

Use Cases

Retail & Finance
Electronic Receipt Archiving
Automatically extracts key information such as amount and date from paper receipts
89.5% accuracy, 15.8% character error rate
Expense Reimbursement System
Recognizes receipt images submitted by employees and automatically fills reimbursement forms
Supports extraction of 12 key fields including <s_total> and <s_date>
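
The character error rate quoted above is conventionally defined as the edit distance between the predicted text and the reference text, divided by the reference length. The sketch below assumes that standard definition; the listing does not say how its own figures were measured, and the sample strings are invented for illustration.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference characters to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """Edit distance normalized by the reference length."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical ground truth and model output: one digit misread.
reference = "TOTAL 23.90"
hypothesis = "TOTAL 28.90"
cer = character_error_rate(reference, hypothesis)
print(f"CER = {cer:.4f}")
```

Because the rate is normalized by reference length, a single misread digit weighs more heavily on a short field such as a total than on a long address line.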