TrOCR Large STR

Developed by Microsoft
TrOCR is a Transformer-based optical character recognition model designed for single-line text images, fine-tuned on multiple standard datasets.
Downloads 571
Release Time: 9/8/2022

Model Overview

The TrOCR model pairs an image Transformer encoder with a text Transformer decoder: the encoder converts the input image into a sequence of patch embeddings, and the decoder generates the recognized text from them token by token.

Model Features

Transformer-based architecture
Encoder-decoder design in which an image Transformer encodes the input and a text Transformer decodes the characters
Multi-dataset fine-tuning
Fine-tuned on multiple standard datasets including IC13, IC15, IIIT5K, and SVT
Pre-trained model initialization
Image encoder initialized with BEiT and text decoder initialized with RoBERTa

Model Capabilities

Single-line text image recognition
Optical Character Recognition
Image-to-text conversion
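
The capabilities above can be exercised through the Hugging Face transformers library. The following is a minimal sketch, not an official recipe: it assumes the checkpoint is published on the Hugging Face Hub as microsoft/trocr-large-str, that transformers, torch, and Pillow are installed, and that the input is already cropped to a single line of text. The helper names load_trocr, recognize_line, and recognize_file are illustrative only.

```python
MODEL_ID = "microsoft/trocr-large-str"  # assumed Hub id for this model


def load_trocr(model_id: str = MODEL_ID):
    """Load the processor (image preprocessing + tokenizer) and the model."""
    # Imported lazily so the helpers below can be used without transformers.
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    return processor, model


def recognize_line(image, processor, model) -> str:
    """Recognize the text in one single-line image (a PIL image)."""
    # The processor resizes and normalizes the image into pixel tensors.
    pixel_values = processor(
        images=image.convert("RGB"), return_tensors="pt"
    ).pixel_values
    # The decoder generates token ids one at a time from the encoded image.
    generated_ids = model.generate(pixel_values)
    # Turn token ids back into a string, dropping special tokens.
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]


def recognize_file(image_path: str) -> str:
    """Convenience wrapper: load the model and read one image file."""
    from PIL import Image

    processor, model = load_trocr()
    return recognize_line(Image.open(image_path), processor, model)
```

Calling recognize_file("sign.png") (a hypothetical file name) downloads the weights on first use and returns the recognized string. When many images are processed, loading the model once with load_trocr and reusing it across calls to recognize_line avoids repeated loading.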

Use Cases

Document digitization
Scanned document recognition
Convert scanned paper documents into editable text
High-accuracy text conversion
Scene text recognition
Street view text recognition
Recognize street signs and advertisement text in photos
Capable of recognizing text in various fonts and backgrounds
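
Because the model reads single-line images, a scanned page in the document digitization use case would first have to be cut into lines. The sketch below shows one simple way to do that, a horizontal projection split over a binarized page. It is an assumption for illustration and not part of TrOCR; the function name find_text_lines and the min_height parameter are hypothetical, and real pages with skew or noise usually need a more robust layout step.

```python
from typing import List, Tuple


def find_text_lines(
    binary_rows: List[List[int]], min_height: int = 2
) -> List[Tuple[int, int]]:
    """Return (top, bottom) row ranges that contain ink, bottom exclusive.

    binary_rows is a 2D list where 1 marks an ink pixel and 0 background.
    Runs of inked rows shorter than min_height are treated as noise.
    """
    lines: List[Tuple[int, int]] = []
    start = None  # first row of the run currently being tracked
    for y, row in enumerate(binary_rows):
        has_ink = any(row)
        if has_ink and start is None:
            start = y  # a new run of inked rows begins
        elif not has_ink and start is not None:
            if y - start >= min_height:
                lines.append((start, y))
            start = None  # the run ended at a blank row
    # Close a run that reaches the bottom edge of the page.
    if start is not None and len(binary_rows) - start >= min_height:
        lines.append((start, len(binary_rows)))
    return lines
```

Each returned range can be used to crop the page, and each crop can then be passed to the recognizer as a separate single-line image.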