UAE License Detection

Developed by codedrainer
Donut is an OCR-free document understanding Transformer model that combines a visual encoder and a text decoder to process document images.
Downloads 21
Release Time: 7/22/2023

Model Overview

A document understanding model that pairs a Swin Transformer visual encoder with a BART text decoder and generates text directly from images, with no OCR preprocessing.
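The encoder-decoder flow described above can be sketched with the Hugging Face transformers library, which provides DonutProcessor and VisionEncoderDecoderModel for Donut checkpoints. This is a minimal sketch and not the author's own inference script: MODEL_ID and TASK_PROMPT are hypothetical placeholders, since this page gives neither the repository id nor the task start token used during fine-tuning, and the helper names run_donut and clean_sequence are illustrative.

```python
import re

# Hypothetical placeholders: this page does not state the checkpoint id
# or the task start token chosen during fine-tuning.
MODEL_ID = "your-namespace/your-donut-checkpoint"
TASK_PROMPT = "<s_license>"


def clean_sequence(sequence, eos_token="</s>", pad_token="<pad>"):
    """Drop EOS/pad tokens and the leading task prompt from decoded text."""
    sequence = sequence.replace(eos_token, "").replace(pad_token, "")
    # Remove only the first tag, which is the task start token.
    return re.sub(r"<.*?>", "", sequence, count=1).strip()


def run_donut(image_path, model_id=MODEL_ID, task_prompt=TASK_PROMPT):
    """Run one document image through the visual encoder and text decoder."""
    # Heavy dependencies are imported lazily so the helper above stays usable
    # without torch, Pillow, or transformers installed.
    import torch
    from PIL import Image
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    model.eval()

    # The visual encoder consumes pixel values directly; no OCR step runs.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(image, return_tensors="pt").pixel_values

    # The text decoder is primed with the task prompt.
    decoder_input_ids = processor.tokenizer(
        task_prompt, add_special_tokens=False, return_tensors="pt"
    ).input_ids

    with torch.no_grad():
        outputs = model.generate(
            pixel_values,
            decoder_input_ids=decoder_input_ids,
            max_length=model.decoder.config.max_position_embeddings,
            pad_token_id=processor.tokenizer.pad_token_id,
            eos_token_id=processor.tokenizer.eos_token_id,
            use_cache=True,
        )

    sequence = processor.batch_decode(outputs)[0]
    sequence = clean_sequence(
        sequence, processor.tokenizer.eos_token, processor.tokenizer.pad_token
    )
    # token2json turns the tagged output sequence into a Python dict.
    return processor.token2json(sequence)
```

Calling run_donut("license.jpg") with a real checkpoint id and the matching task prompt would return a dictionary of decoded fields; the image file name here is likewise only an example.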

Model Features

OCR-free Processing: Directly processes document images without traditional OCR preprocessing steps.
End-to-End Training: Joint training of the visual encoder and text decoder enables end-to-end document understanding.
Multimodal Architecture: Combines Swin Transformer's visual processing capabilities with BART's text generation capabilities.

Model Capabilities

Document Image Classification
Image-to-Text Conversion
Document Content Understanding

Use Cases

Document Processing
Document Classification: Automatically classify types of scanned documents (e.g., invoices, contracts).
Document Content Extraction: Extract structured text information from document images.
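Both use cases rest on the same idea: Donut-style models emit a tagged token sequence, where a field is wrapped as <s_key>value</s_key> and a class label appears as a self-closing tag such as <invoice/>. The following is a simplified sketch of turning such a sequence into a dictionary, assuming flat (non-nested) fields. The function name parse_tagged_output and the field names in the usage note are hypothetical, and real outputs depend on how a given checkpoint was fine-tuned.

```python
import re

# Matches one flat field: <s_key>value</s_key>
FIELD_RE = re.compile(r"<s_([\w-]+)>(.*?)</s_\1>", re.DOTALL)
# Matches a value that is a single self-closing class tag: <label/>
CLASS_RE = re.compile(r"^<([\w-]+)/>$")


def parse_tagged_output(text):
    """Convert flat Donut-style tagged output into a dict.

    Simplified sketch: nested fields and <sep/>-separated lists
    are not handled here.
    """
    fields = {}
    for match in FIELD_RE.finditer(text):
        key = match.group(1)
        value = match.group(2).strip()
        class_match = CLASS_RE.match(value)
        # Classification outputs store the label as a tag name.
        fields[key] = class_match.group(1) if class_match else value
    return fields
```

For example, parse_tagged_output("<s_class><invoice/></s_class>") yields a dictionary with the key "class" mapped to "invoice", while a sequence with made-up fields such as "<s_name>Jane Doe</s_name><s_license_no>12345</s_license_no>" yields one entry per field.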