
TrOCR Base STR

Developed by Microsoft
TrOCR is a Transformer-based optical character recognition (OCR) model designed for recognizing text in single-line images. This version is fine-tuned on multiple standard OCR datasets.
Downloads 692
Release Time: 9/8/2022

Model Overview

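This model uses an encoder-decoder architecture that combines a BEiT image encoder with a RoBERTa text decoder. The encoder turns the input image into a sequence of visual features, and the decoder generates the recognized text from them one token at a time, which makes the model suitable for text recognition across a range of scenarios.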
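The encoder-decoder flow described above can be sketched with the Hugging Face `transformers` library. This is a minimal sketch, not an official recipe: it assumes the checkpoint is published on the Hugging Face Hub as `microsoft/trocr-base-str` (inferred from the model name and developer listed here), and the helper names `load_trocr` and `recognize_line` are hypothetical.

```python
def load_trocr(checkpoint="microsoft/trocr-base-str"):
    """Load the processor and model (assumed Hub id; downloads weights on first use)."""
    # Imported lazily so recognize_line can be used without transformers installed.
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(checkpoint)
    model = VisionEncoderDecoderModel.from_pretrained(checkpoint)
    return processor, model


def recognize_line(image, processor, model):
    """Return the text recognized in one single-line RGB image."""
    # The processor resizes and normalizes the image for the image encoder.
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
    # The text decoder generates token ids autoregressively from the encoded image.
    generated_ids = model.generate(pixel_values)
    # Convert the ids back to a string, dropping special tokens.
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]


# Example usage (requires transformers, torch, and Pillow; "word.png" is a hypothetical file):
#   from PIL import Image
#   processor, model = load_trocr()
#   print(recognize_line(Image.open("word.png").convert("RGB"), processor, model))
```

Passing the processor and model in as arguments keeps the recognition step separate from the one-time download, so the same loaded objects can be reused across many images.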

Model Features

Transformer-based OCR
Uses a Transformer architecture for visual text recognition, combining techniques from computer vision and natural language processing.
Pre-trained model fine-tuning
The image encoder is initialized from pre-trained BEiT weights and the text decoder from pre-trained RoBERTa weights, giving the model strong transfer learning capabilities.
Multi-dataset adaptation
Fine-tuned on multiple standard OCR datasets, including IC13, IC15, IIIT5K, and SVT, for broad applicability.

Model Capabilities

Single-line text image recognition
Scene text recognition
Printed text recognition
Handwritten text recognition (limited support)

Use Cases

Document digitization
Scanned document OCR
Convert scanned document images into editable text
High-accuracy text conversion
Scene text recognition
Street view text recognition
Recognize text on street signs and billboards in photos
Handles text at various angles and under varied lighting conditions
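Because the model is designed for single-line images, the scanned document use case above generally needs a line-splitting step before recognition. The following is a minimal sketch of one generic approach (a horizontal projection profile), not something TrOCR itself provides. It assumes a grayscale page with dark text on a light background and roughly horizontal lines; the function name `split_into_lines` and its parameters are hypothetical.

```python
def split_into_lines(gray_rows, ink_threshold=128, min_height=2):
    """Return (top, bottom) row ranges that contain ink, one per text line.

    gray_rows: sequence of pixel rows with values 0-255 (nested lists or a
    2-D array), dark text on a light background (assumption).
    The bottom index is exclusive, so rows[top:bottom] crops one line.
    """
    # A row "has ink" if any pixel in it is darker than the threshold.
    has_ink = [any(p < ink_threshold for p in row) for row in gray_rows]

    lines, start = [], None
    for y, ink in enumerate(has_ink):
        if ink and start is None:
            start = y  # first inked row of a new line
        elif not ink and start is not None:
            if y - start >= min_height:  # ignore specks thinner than min_height
                lines.append((start, y))
            start = None
    # Close a line that runs to the bottom edge of the page.
    if start is not None and len(has_ink) - start >= min_height:
        lines.append((start, len(has_ink)))
    return lines
```

Each returned range can be used to crop a strip from the page, and each strip can then be passed to the recognition model as a single-line image. Skewed scans or multi-column layouts would need a more robust layout analysis step than this sketch offers.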