Donut Finetune Rvl Cdip
Developed by sitloboi2012
Document classification model based on the Donut framework, fine-tuned on a small subset of the RVL-CDIP dataset
Downloads 18
Release Time: 9/30/2023
Model Overview
This model uses the Donut framework with a VisionEncoderDecoder architecture. It is designed for end-to-end document classification and is suited to English document images.
Model Features
End-to-end document classification
Directly processes image inputs and outputs classification results without separate OCR steps
Small-scale dataset training
Trained on a 100-image subset of RVL-CDIP, suitable for rapid validation and benchmarking
Based on the Donut framework
Uses Donut's OCR-free vision encoder and text decoder architecture for document AI tasks
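The end-to-end flow described above (image in, class label out, no OCR step) can be sketched with the Hugging Face transformers library. This is a minimal sketch under stated assumptions, not the author's documented usage: the repository id `sitloboi2012/donut-finetune-rvl-cdip` is inferred from the page title and author name, the task prompt `<s_rvlcdip>` and the `<s_class><label/></s_class>` output format are borrowed from the public Donut RVL-CDIP checkpoint and may not match this fine-tune, and `extract_class` and `classify` are hypothetical helper names.

```python
import re

# Assumption: repository id inferred from the page title and author; verify before use.
MODEL_ID = "sitloboi2012/donut-finetune-rvl-cdip"
# Assumption: task prompt used by the public Donut RVL-CDIP checkpoint.
TASK_PROMPT = "<s_rvlcdip>"


def extract_class(sequence):
    """Return the label from a Donut-style output such as
    '<s_class><invoice/></s_class>', or None if no label is found."""
    match = re.search(r"<s_class>\s*<([^<>/]+)/>\s*</s_class>", sequence)
    return match.group(1).strip() if match else None


def classify(image_path, model_id=MODEL_ID, task_prompt=TASK_PROMPT):
    """Classify one document image and return the predicted label (or None)."""
    # Heavy dependencies are imported here so extract_class stays usable without them.
    import torch
    from PIL import Image
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    model.eval()

    # The image goes straight to the vision encoder; no OCR text is extracted first.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(image, return_tensors="pt").pixel_values

    # The decoder is primed with the task prompt and generates the label tokens.
    decoder_input_ids = processor.tokenizer(
        task_prompt, add_special_tokens=False, return_tensors="pt"
    ).input_ids

    with torch.no_grad():
        outputs = model.generate(
            pixel_values,
            decoder_input_ids=decoder_input_ids,
            max_length=model.decoder.config.max_position_embeddings,
            pad_token_id=processor.tokenizer.pad_token_id,
            eos_token_id=processor.tokenizer.eos_token_id,
            use_cache=True,
            bad_words_ids=[[processor.tokenizer.unk_token_id]],
        )

    # Decode the generated ids and strip padding and end-of-sequence tokens.
    sequence = processor.batch_decode(outputs)[0]
    sequence = sequence.replace(processor.tokenizer.eos_token, "")
    sequence = sequence.replace(processor.tokenizer.pad_token, "")
    return extract_class(sequence)
```

Calling `classify("scan.png")` on a local image (the file name is a placeholder) would download the checkpoint on first use and return a label string. If the result is `None`, the fine-tune likely uses a different prompt or output format, and inspecting the raw decoded sequence is the quickest way to find out.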
Model Capabilities
Document image classification
English document processing
End-to-end image-to-text conversion
Use Cases
Document management
Food document classification
Automatically identifies and categorizes food-related documents
Financial document processing
Classifies financial documents such as invoices and receipts