
Donut Base Finetuned ZhTrainTicket

Developed by naver-clova-ix
A Donut model fine-tuned on the ZhTrainTicket dataset that converts document images to text without a separate OCR step.
Downloads 362
Release Time: 7/19/2022

Model Overview

Donut is a vision-language model composed of a Swin Transformer encoder and a BART decoder. The encoder turns the document image into visual features, and the decoder generates the text directly from those features.
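
As a rough illustration of the encoder-decoder flow described above, the sketch below runs the checkpoint through the Hugging Face transformers library. The Hub identifier, the task prompt `<s_zhtrainticket>`, and the helper names `strip_special_tokens` and `read_ticket` are assumptions that do not appear on this page, so check them against the published checkpoint before relying on them.

```python
import re

# Assumed Hugging Face Hub identifier and task prompt; verify both before use.
MODEL_ID = "naver-clova-ix/donut-base-finetuned-zhtrainticket"
TASK_PROMPT = "<s_zhtrainticket>"


def strip_special_tokens(sequence: str, eos_token: str, pad_token: str) -> str:
    """Remove EOS/PAD tokens and the leading task-prompt tag from decoder output."""
    sequence = sequence.replace(eos_token, "").replace(pad_token, "")
    # Drop only the first <...> tag, which is the task prompt.
    return re.sub(r"<.*?>", "", sequence, count=1).strip()


def read_ticket(image_path: str) -> dict:
    """Run the encoder and decoder on one image and return the parsed fields."""
    # Heavy dependencies are imported lazily so the helper above works on its own.
    import torch
    from PIL import Image
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(MODEL_ID)
    model = VisionEncoderDecoderModel.from_pretrained(MODEL_ID)
    model.eval()

    # The image goes to the visual encoder; no OCR output is supplied.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(image, return_tensors="pt").pixel_values

    # The text decoder is primed with the task prompt.
    decoder_input_ids = processor.tokenizer(
        TASK_PROMPT, add_special_tokens=False, return_tensors="pt"
    ).input_ids

    with torch.no_grad():
        outputs = model.generate(
            pixel_values,
            decoder_input_ids=decoder_input_ids,
            max_length=model.decoder.config.max_position_embeddings,
            pad_token_id=processor.tokenizer.pad_token_id,
            eos_token_id=processor.tokenizer.eos_token_id,
            use_cache=True,
            bad_words_ids=[[processor.tokenizer.unk_token_id]],
            return_dict_in_generate=True,
        )

    sequence = processor.batch_decode(outputs.sequences)[0]
    sequence = strip_special_tokens(
        sequence, processor.tokenizer.eos_token, processor.tokenizer.pad_token
    )
    # token2json converts the tagged sequence into a Python dict.
    return processor.token2json(sequence)
```

Calling `read_ticket("ticket.jpg")` (a hypothetical file name) would download the weights on first use, so it needs network access plus the `transformers`, `torch`, and `Pillow` packages.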

Model Features

OCR-free Processing
Understands document images directly through a visual encoder, eliminating the traditional OCR preprocessing step
End-to-End Training
Trains the visual encoder and text decoder jointly for end-to-end document understanding
Chinese Receipt Recognition
Fine-tuned on ZhTrainTicket to recognize Chinese train tickets and similar receipts

Model Capabilities

Document Image Understanding
Visual Text Extraction
Receipt Information Recognition

Use Cases

Receipt Processing
Train Ticket Information Extraction
Automatically extracts the train number, date, fare, and other fields from Chinese train ticket images
Document Digitization
Document Content Extraction
Converts scanned documents into structured text data
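
To make "structured text data" concrete: Donut-style decoders emit fields wrapped in tags such as `<s_key>value</s_key>`, which are then converted to key-value pairs. The sketch below is a simplified stand-in for that conversion, handling flat fields only. The function name `parse_fields`, the field names, and the sample values are all hypothetical and are not taken from the model's actual output.

```python
import re


def parse_fields(sequence: str) -> dict:
    """Turn '<s_key>value</s_key>' pairs into a flat dict (no nesting, no lists)."""
    pattern = re.compile(
        r"<s_(?P<key>[^>]+)>(?P<value>.*?)</s_(?P=key)>", re.DOTALL
    )
    return {m.group("key"): m.group("value").strip() for m in pattern.finditer(sequence)}


# Hypothetical decoder output; field names and values are invented for illustration.
sample = (
    "<s_train_num>G1234</s_train_num>"
    "<s_date>2022-07-19</s_date>"
    "<s_ticket_rates>553.0</s_ticket_rates>"
)
fields = parse_fields(sample)
print(fields)
```

In practice the processor's own conversion routine would normally be used instead, since it also copes with nested and repeated fields.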