Donut Base Finetuned Zhtrainticket
Donut model fine-tuned on ZhTrainTicket for document image-to-text conversion without OCR processing.
Downloads 362
Release Time: 7/19/2022
Model Overview
Donut is a vision-language model composed of a Swin Transformer encoder and a BART decoder. It extracts text information directly from images, with no separate OCR stage.
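A minimal inference sketch, assuming the checkpoint is published on the Hugging Face Hub as naver-clova-ix/donut-base-finetuned-zhtrainticket and that its task start token is <s_zhtrainticket>. Both names are assumptions, not confirmed by this page, and the function name extract_ticket_fields is hypothetical. The calls follow the general Donut usage pattern in the transformers library: the encoder consumes pixel values, and the decoder generates a tagged token sequence from the task prompt.

```python
import re

# Assumed identifiers; verify both against the actual model repository.
MODEL_ID = "naver-clova-ix/donut-base-finetuned-zhtrainticket"
TASK_PROMPT = "<s_zhtrainticket>"


def extract_ticket_fields(image_path, model_id=MODEL_ID, task_prompt=TASK_PROMPT):
    """Run a Donut checkpoint on one image and return the decoded fields."""
    # Heavy dependencies are imported lazily so this module loads without them.
    import torch
    from PIL import Image
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    model.eval()

    # The Swin encoder takes pixel values; no OCR step runs beforehand.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(image, return_tensors="pt").pixel_values

    # The BART decoder is primed with the task prompt token.
    decoder_input_ids = processor.tokenizer(
        task_prompt, add_special_tokens=False, return_tensors="pt"
    ).input_ids

    with torch.no_grad():
        output_ids = model.generate(
            pixel_values,
            decoder_input_ids=decoder_input_ids,
            max_length=model.decoder.config.max_position_embeddings,
            pad_token_id=processor.tokenizer.pad_token_id,
            eos_token_id=processor.tokenizer.eos_token_id,
            use_cache=True,
        )

    # Strip special tokens and the leading task prompt, then convert to a dict.
    sequence = processor.batch_decode(output_ids)[0]
    sequence = sequence.replace(processor.tokenizer.eos_token, "")
    sequence = sequence.replace(processor.tokenizer.pad_token, "")
    sequence = re.sub(r"<.*?>", "", sequence, count=1).strip()
    return processor.token2json(sequence)
```

Calling extract_ticket_fields("ticket.jpg") with a local image path (a placeholder here) would download the weights on first use and return a dictionary of whatever fields the checkpoint emits.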
Model Features
OCR-free Processing
Understands document images directly through a visual encoder, eliminating the traditional OCR preprocessing step
End-to-End Training
Joint training of visual encoder and text decoder for end-to-end document understanding
Chinese Receipt Recognition
Specially optimized for Chinese train tickets and other receipts
Model Capabilities
Document Image Understanding
Visual Text Extraction
Receipt Information Recognition
Use Cases
Receipt Processing
Train Ticket Information Extraction
Automatically extracts train number, date, fare and other information from Chinese train ticket images
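To illustrate what "extracting" fields means in practice: Donut-style decoders emit a flat string in which each value is wrapped in <s_key>...</s_key> tags, and that string is then converted to a dictionary. The sketch below is a simplified stand-in for that conversion that handles flat, non-nested fields only. The function name parse_donut_sequence, the field names (train_num, date, ticket_rates) and the sample values are hypothetical, chosen for illustration rather than taken from the model's real output.

```python
import re


def parse_donut_sequence(sequence):
    """Convert flat '<s_key>value</s_key>' text into a dict (no nesting)."""
    fields = {}
    # The backreference \1 requires the closing tag to match the opening key.
    pattern = r"<s_(\w+)>(.*?)</s_\1>"
    for key, value in re.findall(pattern, sequence, flags=re.DOTALL):
        fields[key] = value.strip()
    return fields


# Hypothetical decoder output; the field names and values are made up.
sample = (
    "<s_train_num>G1234</s_train_num>"
    "<s_date>2022-07-19</s_date>"
    "<s_ticket_rates>553.0</s_ticket_rates>"
)
ticket = parse_donut_sequence(sample)
print(ticket)
```

Keeping the parsing step separate from inference makes it straightforward to validate or normalize individual fields (for example, dates and fares) before storing them.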
Document Digitization
Document Content Extraction
Converts scanned documents into structured text data