Donut Base Finetuned Cord V2

Developed by Xenova
Donut is an OCR-free visual document understanding model that pairs a Swin Transformer image encoder with a text decoder. This checkpoint is fine-tuned on the CORD receipt dataset and extracts structured text information from images.
Downloads 32
Release Time: 9/5/2023

Model Overview

This checkpoint uses the Donut architecture and is fine-tuned on the CORD dataset. It takes a document image as input and generates the structured text fields found in it.

Model Features

Visual Document Understanding
Extracts structured text information directly from document images. Because it is fine-tuned on CORD, it is best suited to receipt-like documents.
Based on Swin Transformer
Uses a Swin Transformer as the image encoder to extract visual features from the document.
Web-Compatible
Converted to ONNX format, so it can run on the web via Transformers.js.

Model Capabilities

Document image processing
Structured text extraction
Visual feature recognition

Use Cases

Document Processing
Receipt Information Extraction
Automatically extracts structured information such as item names, prices, and totals from receipt images
Improves data entry efficiency and reduces manual processing
Form Recognition
Identifies fields and content in various forms
Enables automated processing of form data
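
To make "structured text extraction" concrete: Donut models generate a tagged token sequence in which each field is wrapped as `<s_key>value</s_key>` and repeated groups are separated by `<sep/>`. The sketch below is a simplified TypeScript take on converting such a sequence into a plain object. It is not code from this model's repository; the function names (`parseGroups`, `donutSequenceToJson`) are hypothetical, the sample receipt values are invented, and the field names (`menu`, `nm`, `price`, `total`, `total_price`) are assumed here for illustration.

```typescript
// A parsed field is a string, a list of fields, or a nested object.
type Field = string | Field[] | { [key: string]: Field };
type GroupNode = { [key: string]: Field };

// Parse a Donut-style tagged sequence into one object per <sep/>-separated group.
function parseGroups(seq: string): GroupNode[] {
  const node: GroupNode = {};
  let rest = seq.trim();
  while (rest.length > 0) {
    const open = /<s_(.*?)>/.exec(rest);
    if (open === null) break;
    const key = open[1];
    const closeTag = `</s_${key}>`;
    const contentStart = open.index + open[0].length;
    const closeAt = rest.indexOf(closeTag, contentStart);
    if (closeAt === -1) {
      // Unclosed tag: drop it and keep scanning the remaining text.
      rest = rest.slice(0, open.index) + rest.slice(contentStart);
      continue;
    }
    const content = rest.slice(contentStart, closeAt).trim();
    if (content.includes("<s_") && content.includes("</s_")) {
      // Nested fields: recurse, unwrapping a single group into an object.
      const children = parseGroups(content);
      if (children.length > 0) {
        node[key] = children.length === 1 ? children[0] : children;
      }
    } else {
      // Leaf: one value, or several values separated by <sep/>.
      const leaves = content.split("<sep/>").map((s) => s.trim());
      node[key] = leaves.length === 1 ? leaves[0] : leaves;
    }
    rest = rest.slice(closeAt + closeTag.length).trim();
    if (rest.startsWith("<sep/>")) {
      // A separator after a closed field starts the next sibling group.
      return [node, ...parseGroups(rest.slice("<sep/>".length))];
    }
  }
  return Object.keys(node).length > 0 ? [node] : [];
}

// Top-level entry point: falls back to the raw text if no tags are found.
function donutSequenceToJson(seq: string): Field {
  const groups = parseGroups(seq);
  if (groups.length === 0) return { text_sequence: seq };
  return groups.length === 1 ? groups[0] : groups;
}

// Invented sample sequence, for illustration only.
const sample =
  "<s_menu><s_nm>Latte</s_nm><s_price>4.50</s_price><sep/>" +
  "<s_nm>Bagel</s_nm><s_price>3.00</s_price></s_menu>" +
  "<s_total><s_total_price>7.50</s_total_price></s_total>";
console.log(JSON.stringify(donutSequenceToJson(sample), null, 2));
```

For the sample sequence, `menu` becomes a list of two item objects and `total` becomes a single nested object. Text with no recognizable tags is returned under a `text_sequence` key rather than discarded, so a failed generation is still visible to the caller.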