Donut Base Finetuned Cord V2
Donut is a visual document understanding model built on a Swin Transformer encoder. This checkpoint is fine-tuned on the CORD receipt dataset and extracts structured text information from document images.
Downloads 32
Release Time: 9/5/2023
Model Overview
The model follows the Donut architecture and is fine-tuned on the CORD dataset. It takes a document image as input and outputs the structured text information the image contains.
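The flow described above (image in, structured fields out) can be sketched in Python with the Hugging Face transformers library. This is a minimal sketch under stated assumptions, not the page's documented usage: it assumes the original PyTorch checkpoint is published on the Hub as naver-clova-ix/donut-base-finetuned-cord-v2, that the task prompt is <s_cord-v2>, and that transformers, torch, and Pillow are installed. The function name extract_fields is hypothetical.

```python
import re


def extract_fields(image_path, checkpoint="naver-clova-ix/donut-base-finetuned-cord-v2"):
    """Run Donut on one document image and return the parsed fields as a dict.

    The checkpoint id is an assumption; swap in whichever checkpoint you use.
    """
    # Heavy dependencies are imported lazily so this module loads without them.
    from PIL import Image
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(checkpoint)
    model = VisionEncoderDecoderModel.from_pretrained(checkpoint)

    # Preprocess the image into the pixel tensor the encoder expects.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(image, return_tensors="pt").pixel_values

    # The decoder is primed with a task prompt token (assumed for CORD v2).
    task_prompt = "<s_cord-v2>"
    decoder_input_ids = processor.tokenizer(
        task_prompt, add_special_tokens=False, return_tensors="pt"
    ).input_ids

    outputs = model.generate(
        pixel_values,
        decoder_input_ids=decoder_input_ids,
        max_length=model.decoder.config.max_position_embeddings,
        pad_token_id=processor.tokenizer.pad_token_id,
        eos_token_id=processor.tokenizer.eos_token_id,
        bad_words_ids=[[processor.tokenizer.unk_token_id]],
    )

    # Decode, drop padding/end tokens, and remove the leading task prompt token.
    sequence = processor.batch_decode(outputs)[0]
    sequence = sequence.replace(processor.tokenizer.eos_token, "")
    sequence = sequence.replace(processor.tokenizer.pad_token, "")
    sequence = re.sub(r"<.*?>", "", sequence, count=1).strip()

    # token2json turns the tagged sequence into nested Python data.
    return processor.token2json(sequence)
```

Calling extract_fields("receipt.png") downloads the checkpoint on first use; the file name here is only a placeholder.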
Model Features
Visual Document Understanding
Extracts structured text information from document images such as receipts and forms.
Based on Swin Transformer
Uses a Swin Transformer as the image encoder to extract visual features from the document image.
Web-Compatible
Converted to ONNX format, so it can run in the browser via Transformers.js.
Model Capabilities
Document image processing
Structured text extraction
Visual feature recognition
Use Cases
Document Processing
Receipt Information Extraction
Automatically extracts structured information such as merchant, amount, and date from receipt images.
Improves data entry efficiency and reduces manual processing.
Form Recognition
Identifies fields and content in various forms.
Enables automated processing of form data.
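For the receipt use case above, Donut-style models emit the extracted fields as a sequence of paired tags (for example <s_total>...</s_total>) that is then converted into nested data. The sketch below is a simplified illustration of that conversion, not the library's own parser: the function name parse_tagged and the sample receipt string are hypothetical, and it ignores details of the real format such as the <sep/> list separator. In the Hugging Face transformers library, DonutProcessor.token2json performs the full conversion.

```python
import re

# Matches one <s_key>...</s_key> pair; the backreference ties the closing tag
# to the same key as the opening tag.
TAG = re.compile(r"<s_([^>]+)>(.*?)</s_\1>", re.DOTALL)


def parse_tagged(sequence):
    """Convert a Donut-style tagged string into nested dicts (simplified)."""
    matches = list(TAG.finditer(sequence))
    if not matches:
        # No tags left: this is a leaf value.
        return sequence.strip()

    result = {}
    for match in matches:
        key = match.group(1)
        value = parse_tagged(match.group(2))
        if key in result:
            # A repeated key becomes a list of values.
            if not isinstance(result[key], list):
                result[key] = [result[key]]
            result[key].append(value)
        else:
            result[key] = value
    return result


# Hypothetical model output for a one-item receipt.
sample = (
    "<s_menu><s_nm>Latte</s_nm><s_price>4.50</s_price></s_menu>"
    "<s_total><s_total_price>4.50</s_total_price></s_total>"
)
fields = parse_tagged(sample)
print(fields["total"]["total_price"])
```

Downstream code can then read individual fields by key instead of scanning raw text, which is what makes automated data entry practical.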
© 2025 AIbase