OCR Donut CORD: A Free, Open-Source Model for Receipt Document Understanding Without OCR

OCR Donut CORD

Developed by jinhybr

Donut is an OCR-free document understanding model that pairs a Swin Transformer visual encoder with a BART text decoder. This version is fine-tuned on the CORD receipt dataset.

Image-to-Text

Transformers

Open Source License: MIT

Tags: #OCR-free document understanding #Receipt text parsing #Swin-BART architecture

Downloads 1,130

Release Time: 11/4/2022

Model Overview

An OCR-free document understanding Transformer that extracts and interprets text directly from images, with no separate OCR step. It is particularly suited to receipt parsing.
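As a rough illustration of what "directly from images" means in practice, the sketch below runs a receipt image through the model with the Hugging Face Transformers classes DonutProcessor and VisionEncoderDecoderModel. The repository id "jinhybr/OCR-Donut-CORD" and the task prompt "<s_cord-v2>" are assumptions that this page does not state, so check both against the model's own files before relying on them. The function names parse_receipt and clean_sequence are hypothetical helpers introduced here.

```python
import re


def clean_sequence(raw: str, eos_token: str, pad_token: str) -> str:
    """Strip EOS/PAD tokens and the leading task-prompt tag from decoder output."""
    text = raw.replace(eos_token, "").replace(pad_token, "")
    # Remove only the first tag, which is the task prompt echoed by the decoder.
    return re.sub(r"<.*?>", "", text, count=1).strip()


def parse_receipt(
    image_path: str,
    model_id: str = "jinhybr/OCR-Donut-CORD",  # ASSUMPTION: repo id not given on this page
    task_prompt: str = "<s_cord-v2>",          # ASSUMPTION: prompt of the CORD checkpoint
):
    """Return the structured fields Donut generates for one receipt image."""
    # Heavy imports are deferred so clean_sequence stays usable without them.
    import torch
    from PIL import Image
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)
    model.eval()

    # The image goes straight to the visual encoder; no OCR engine is involved.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(image, return_tensors="pt").pixel_values
    decoder_input_ids = processor.tokenizer(
        task_prompt, add_special_tokens=False, return_tensors="pt"
    ).input_ids

    outputs = model.generate(
        pixel_values.to(device),
        decoder_input_ids=decoder_input_ids.to(device),
        max_length=model.decoder.config.max_position_embeddings,
        pad_token_id=processor.tokenizer.pad_token_id,
        eos_token_id=processor.tokenizer.eos_token_id,
        use_cache=True,
        bad_words_ids=[[processor.tokenizer.unk_token_id]],
        return_dict_in_generate=True,
    )

    raw = processor.batch_decode(outputs.sequences)[0]
    cleaned = clean_sequence(
        raw, processor.tokenizer.eos_token, processor.tokenizer.pad_token
    )
    # token2json converts Donut's tag markup into nested Python data.
    return processor.token2json(cleaned)
```

Calling parse_receipt("receipt.jpg") (a hypothetical local file) would download the checkpoint on first use and return a nested dictionary of receipt fields.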

Model Features

OCR-free processing

Eliminates the traditional OCR preprocessing step and reads text content directly from the image

End-to-end training

The visual encoder and text decoder are trained jointly, so both are optimized for the final parsing task rather than as separate stages

Document understanding capability

Recognizes text and also captures document structure and the semantic relationships between fields
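To make the structure-aware output concrete: Donut's decoder emits a tag markup such as <s_total>...</s_total>, which is then converted into nested data. The sketch below is a simplified, pure-Python stand-in for that conversion, not the library's implementation (Transformers provides DonutProcessor.token2json for real use). The function name markup_to_dict and the sample string are invented for illustration. One known limitation of this sketch is that it splits on <sep/> naively, so lists nested inside other lists are not handled.

```python
import re

TAG_OPEN = re.compile(r"<s_(.*?)>")
SEP = "<sep/>"


def markup_to_dict(seq: str) -> dict:
    """Convert <s_key>value</s_key> markup into a nested dict (simplified)."""
    fields = {}
    pos = 0
    while True:
        match = TAG_OPEN.search(seq, pos)
        if match is None:
            break
        key = match.group(1)
        close_tag = f"</s_{key}>"
        end = seq.find(close_tag, match.end())
        if end == -1:
            # Unclosed tag: skip it and keep scanning.
            pos = match.end()
            continue
        content = seq[match.end():end].strip()
        if "<s_" in content:
            # Nested fields; <sep/> separates repeated groups such as menu items.
            groups = [markup_to_dict(chunk) for chunk in content.split(SEP)]
            fields[key] = groups[0] if len(groups) == 1 else groups
        else:
            # Leaf value; <sep/> separates multiple plain values.
            values = [v.strip() for v in content.split(SEP)]
            fields[key] = values[0] if len(values) == 1 else values
        pos = end + len(close_tag)
    return fields


# Invented sample in the style of CORD output (menu items plus a total).
sample = (
    "<s_menu><s_nm>Latte</s_nm><s_price>4.50</s_price><sep/>"
    "<s_nm>Bagel</s_nm><s_price>3.00</s_price></s_menu>"
    "<s_total><s_total_price>7.50</s_total_price></s_total>"
)
receipt = markup_to_dict(sample)
print(receipt["menu"])
print(receipt["total"]["total_price"])
```

The nesting is what separates this from plain OCR text: each price stays attached to its item name, and the total sits in its own group instead of being one more line of text.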

Model Capabilities

Image-to-text

Document parsing

Receipt information extraction

Vision-language understanding

Use Cases

Business document processing

Receipt information extraction

Automatically extracts key information such as merchant, amount, and date from receipt images

Performs well on the CORD receipt dataset, on which it was fine-tuned

Invoice processing

Automatically parses structured data from invoices

Document digitization

Paper document conversion

Converts scanned paper documents into structured digital format


🚀 Donut (base-sized model, fine-tuned on CORD)
