Donut-base-finetuned-zhtrainticket Open Source Model - Achieve Document Image to Text Conversion without OCR Processing


Donut Base Finetuned Zhtrainticket

Developed by naver-clova-ix

Donut model fine-tuned on ZhTrainTicket for document image-to-text conversion without OCR processing.

Image-to-Text

Transformers

Open Source License: MIT

#OCR-free Document Parsing #Train Ticket Information Extraction #Swin-BART Architecture

Downloads 362

Release Time: 7/19/2022

Model Overview

Donut is a vision-language model composed of a Swin Transformer image encoder and a BART text decoder. It generates text directly from a document image, with no separate OCR step. This checkpoint is the base-sized model fine-tuned on ZhTrainTicket.
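A minimal inference sketch with the Hugging Face `transformers` library may make the encoder-decoder flow more concrete. It assumes the checkpoint is published as `naver-clova-ix/donut-base-finetuned-zhtrainticket` and that the task start token follows the `<s_{dataset}>` convention, i.e. `<s_zhtrainticket>`; both are assumptions not stated on this page, and `parse_ticket` and `strip_task_prompt` are illustrative helper names.

```python
import re

# Assumed Hub identifier and task start token (not confirmed by this page).
MODEL_ID = "naver-clova-ix/donut-base-finetuned-zhtrainticket"
TASK_PROMPT = "<s_zhtrainticket>"


def strip_task_prompt(sequence: str, eos_token: str = "</s>", pad_token: str = "<pad>") -> str:
    """Drop EOS/pad tokens and the first tag (the task start token) from decoded output."""
    sequence = sequence.replace(eos_token, "").replace(pad_token, "")
    return re.sub(r"<.*?>", "", sequence, count=1).strip()


def parse_ticket(image_path: str) -> dict:
    """Run the model on one image and return the decoded fields as a dict."""
    # Heavy imports stay local so strip_task_prompt can be used without them.
    import torch
    from PIL import Image
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(MODEL_ID)
    model = VisionEncoderDecoderModel.from_pretrained(MODEL_ID)
    model.eval()

    # Swin encoder input: the raw image, resized and normalized by the processor.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(image, return_tensors="pt").pixel_values

    # BART decoder input: only the task start token.
    decoder_input_ids = processor.tokenizer(
        TASK_PROMPT, add_special_tokens=False, return_tensors="pt"
    ).input_ids

    with torch.no_grad():
        outputs = model.generate(
            pixel_values,
            decoder_input_ids=decoder_input_ids,
            max_length=model.decoder.config.max_position_embeddings,
            pad_token_id=processor.tokenizer.pad_token_id,
            eos_token_id=processor.tokenizer.eos_token_id,
            use_cache=True,
            bad_words_ids=[[processor.tokenizer.unk_token_id]],
        )

    sequence = processor.batch_decode(outputs)[0]
    cleaned = strip_task_prompt(
        sequence, processor.tokenizer.eos_token, processor.tokenizer.pad_token
    )
    return processor.token2json(cleaned)
```

Calling `parse_ticket("ticket.jpg")` (a hypothetical file name) would download the weights on first use; the returned keys depend on the fine-tuning annotations and are not listed here.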

Model Features

OCR-free Processing

Understands document images directly through the visual encoder, eliminating the traditional OCR preprocessing step

End-to-End Training

The visual encoder and text decoder are trained jointly for end-to-end document understanding

Chinese Receipt Recognition

Fine-tuned on the ZhTrainTicket dataset for Chinese train tickets and similar receipts

Model Capabilities

Document Image Understanding

Visual Text Extraction

Receipt Information Recognition

Use Cases

Receipt Processing

Train Ticket Information Extraction

Automatically extracts train number, date, fare and other information from Chinese train ticket images
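To illustrate what the extracted information can look like, the sketch below parses a Donut-style tagged output string into a dictionary. It is a simplified stand-in for the library's own conversion and handles only flat `<s_key>value</s_key>` pairs; the field names and values in `sample` are invented for illustration and are not taken from the model's actual schema.

```python
import re


def tags_to_dict(sequence: str) -> dict:
    """Parse flat <s_key>value</s_key> pairs into a dict (no nesting, no lists)."""
    pattern = re.compile(r"<s_(\w+)>(.*?)</s_\1>", re.DOTALL)
    return {key: value.strip() for key, value in pattern.findall(sequence)}


# Hypothetical field names and values, for illustration only.
sample = (
    "<s_train_num>G1234</s_train_num>"
    "<s_date>2022-07-19</s_date>"
    "<s_fare>100.0</s_fare>"
)
fields = tags_to_dict(sample)
print(fields)
```

Each tag pair becomes one key in the result, which is how a single generated token sequence can carry several ticket fields at once.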

Document Digitization

Document Content Extraction

Converts scanned documents into structured text data

