Donut-finetune-rvl-cdip Open-source Document Classification Model - Accurately Classify Documents Based on Small-scale Datasets


Donut Finetune Rvl Cdip

Developed by sitloboi2012

Document classification model based on the Donut framework, trained on a small-scale subset of the RVL-CDIP dataset

Image-to-Text

Transformers

Language: English

Open Source License: Apache-2.0

#End-to-end document classification #Few-shot training #Image-to-text conversion

Downloads 18

Release Time: 9/30/2023

Model Overview

This model uses the Donut framework with a VisionEncoderDecoder architecture. It is designed for end-to-end document classification and is suited to English document images.
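As a rough illustration of how a Donut-style VisionEncoderDecoder checkpoint is usually run for classification with the transformers library, the sketch below loads a processor and model, generates an output sequence from an image, and extracts the class label. Several things here are assumptions not confirmed by this page: the helper names `extract_class` and `classify` are hypothetical, the repository ID must be supplied by the caller as `model_id`, and the task prompt `<s_rvlcdip>` and the `<s_class>...</s_class>` output format follow the convention of the public Donut RVL-CDIP checkpoint, which this fine-tune may or may not share.

```python
import re
from typing import Optional


def extract_class(sequence: str) -> Optional[str]:
    """Pull the label out of a Donut-style output sequence.

    Assumed formats: '<s_class><invoice/></s_class>' or '<s_class>invoice</s_class>'.
    Returns None when no class field is found.
    """
    match = re.search(r"<s_class>\s*<?([^<>/]+?)(?:/>)?\s*</s_class>", sequence)
    return match.group(1) if match else None


def classify(image_path: str, model_id: str, task_prompt: str = "<s_rvlcdip>") -> Optional[str]:
    """Classify one document image (sketch; prompt and model_id are assumptions)."""
    # Heavy dependencies are imported lazily so extract_class works without them.
    import torch
    from PIL import Image
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    model.eval()

    # The image goes straight to the vision encoder; no separate OCR pass.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(image, return_tensors="pt").pixel_values

    # The decoder is primed with the task prompt token.
    decoder_input_ids = processor.tokenizer(
        task_prompt, add_special_tokens=False, return_tensors="pt"
    ).input_ids

    with torch.no_grad():
        output_ids = model.generate(
            pixel_values,
            decoder_input_ids=decoder_input_ids,
            max_length=model.decoder.config.max_position_embeddings,
            pad_token_id=processor.tokenizer.pad_token_id,
            eos_token_id=processor.tokenizer.eos_token_id,
        )

    sequence = processor.batch_decode(output_ids)[0]
    return extract_class(sequence)
```

Calling `classify("scan.png", model_id=...)` with the checkpoint's actual repository ID would return a label string, or `None` if the generated sequence does not match the assumed format.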

Model Features

End-to-end document classification

Directly processes image inputs and outputs classification results without separate OCR steps

Small-scale dataset training

Trained on a 100-image subset of RVL-CDIP, suitable for rapid validation and benchmarking

Based on the Donut framework

Uses Donut's OCR-free encoder-decoder architecture (a vision encoder paired with a text decoder) for document AI tasks
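The page does not say how the 100-image subset was drawn. As one plausible approach, the sketch below picks a fixed-size, roughly class-balanced subset from a list of `(path, label)` records by cycling through the labels. The function name `sample_subset` and the record layout are hypothetical and are not taken from the model's training code.

```python
import random
from collections import defaultdict


def sample_subset(records, size=100, seed=0):
    """Draw up to `size` (path, label) records, round-robin across labels."""
    rng = random.Random(seed)

    # Group paths by label, then shuffle each group reproducibly.
    by_label = defaultdict(list)
    for path, label in records:
        by_label[label].append(path)
    for paths in by_label.values():
        rng.shuffle(paths)

    # Take one item per label in turn until the subset is full
    # or every label has run out of items.
    subset = []
    labels = sorted(by_label)
    while len(subset) < size and any(by_label[name] for name in labels):
        for name in labels:
            if by_label[name] and len(subset) < size:
                subset.append((by_label[name].pop(), name))
    return subset
```

With the 16 RVL-CDIP classes and a target of 100 images, this yields six or seven images per class; a fixed `seed` keeps the selection reproducible across runs.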

Model Capabilities

Document image classification

English document processing

End-to-end image-to-text conversion

Use Cases

Document management

Food document classification

Automatically identifies and categorizes food-related documents

Financial document processing

Classifies financial documents such as invoices and receipts

Property	Details
Model Type	Baseline model for using RVL-CDIP with Donut
Training Data	sitloboi2012/rvl_cdip_small_dataset, sitloboi2012/rvl_cdip_large_dataset
Library Name	transformers
Pipeline Tag	image-to-text
Tags	DocumentAI, ImageClassification, Donut
Metrics	accuracy
License	Apache-2.0
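
The table lists accuracy as the evaluation metric. For reference, classification accuracy is the fraction of documents whose predicted label matches the ground-truth label. A minimal sketch follows; the function name `accuracy` and the example labels are illustrative only and are not results reported for this model.

```python
def accuracy(predictions, references):
    """Fraction of predictions that exactly match their reference label."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Illustrative labels only: three of the four predictions match.
example = accuracy(
    ["invoice", "letter", "memo", "email"],
    ["invoice", "letter", "email", "email"],
)
print(example)  # 0.75
```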
