TrOCR Small Spanish

Developed by qantev
Transformer-based OCR model for printed Spanish text. It does not support handwriting recognition.
Downloads 270
Release Date: February 22, 2024

Model Overview

This TrOCR small model is fine-tuned for recognizing printed Spanish text. It pairs a vision Transformer encoder with a text Transformer decoder and was fine-tuned on a custom-built dataset.

Model Features

Spanish-Specific Optimization
Trained on a custom-built dataset of 2 million Spanish samples and tuned for printed character recognition
Efficient Architecture Design
Uses a vision Transformer encoder to extract image features and a text Transformer decoder to generate the output sequence, enabling end-to-end recognition
Real-Time Data Augmentation
Generates augmented images on the fly during training, which is significantly more efficient than loading pre-stored augmented images
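
The real-time augmentation idea can be illustrated with a minimal sketch. The two transforms used here (a brightness shift and salt-and-pepper noise) and all function names are illustrative assumptions; the listing does not say which augmentations the authors applied. Images are modeled as plain lists of grayscale values so the example needs only the standard library.

```python
import random
from typing import Iterable, Iterator, List, Tuple

Image = List[List[int]]  # grayscale pixels in 0-255, row-major


def shift_brightness(img: Image, rng: random.Random, max_delta: int = 40) -> Image:
    """Add one random offset to every pixel, clamped to the 0-255 range."""
    delta = rng.randint(-max_delta, max_delta)
    return [[min(255, max(0, px + delta)) for px in row] for row in img]


def add_noise(img: Image, rng: random.Random, prob: float = 0.05) -> Image:
    """Set a random fraction of pixels to black or white (salt-and-pepper noise)."""
    return [
        [rng.choice((0, 255)) if rng.random() < prob else px for px in row]
        for row in img
    ]


def augmented_stream(
    samples: Iterable[Tuple[Image, str]], epochs: int, seed: int = 0
) -> Iterator[Tuple[Image, str]]:
    """Yield a freshly augmented copy of each (image, text) pair on every epoch.

    Nothing is written to disk: only the clean originals are kept, and each
    variant exists just long enough to be consumed by the training step.
    """
    rng = random.Random(seed)
    samples = list(samples)
    for _ in range(epochs):
        for img, text in samples:
            yield add_noise(shift_brightness(img, rng), rng), text
```

Because each variant is produced when it is requested, storage stays proportional to the number of clean samples rather than to samples times augmentations.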

Model Capabilities

Printed Text Recognition
Spanish Text Extraction
Sentence-Level OCR Processing
Image-to-Text Conversion
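
As a rough sketch of how such a model is typically run, the following uses the Hugging Face `transformers` TrOCR classes. The Hub identifier `qantev/trocr-small-spanish` is an assumption inferred from the developer and model name in this listing and should be verified before use. The helper names are hypothetical, and the example assumes `transformers`, `torch`, and `Pillow` are installed.

```python
from typing import Any, Tuple

# Assumed Hub identifier (not stated in the listing); verify before use.
MODEL_ID = "qantev/trocr-small-spanish"


def load_ocr(model_id: str = MODEL_ID) -> Tuple[Any, Any]:
    """Load the processor (image preprocessing and tokenizer) and the model."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    model.eval()  # inference mode: disables dropout
    return processor, model


def recognize_line(image_path: str, processor: Any, model: Any) -> str:
    """Run OCR on an image containing a single line of printed text."""
    import torch
    from PIL import Image

    image = Image.open(image_path).convert("RGB")
    # The processor resizes and normalizes the image for the vision encoder.
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
    with torch.no_grad():
        # The text decoder generates token ids autoregressively.
        generated_ids = model.generate(pixel_values)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

A typical call loads the model once with `processor, model = load_ocr()` and then runs `recognize_line(path, processor, model)` for each image. Since the model works at sentence level, full pages would need to be split into line or sentence crops first.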

Use Cases

Document Digitization
Wikipedia Content Extraction
Extracts text from images of Spanish Wikipedia pages
Character error rate (CER): 6.32% (reported for the large model)
Form Processing
XFUND Dataset Processing
Text recognition for Spanish form documents
Significantly outperforms EasyOCR (CER reduced by 12.84%)
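
For readers unfamiliar with the metric, character error rate is commonly computed as the Levenshtein edit distance between the predicted and reference strings, divided by the reference length. The sketch below follows that common definition. Whether the figures above were computed with extra normalization (case folding, whitespace handling) is not stated, so this is an illustration rather than the evaluation script.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference characters to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate of a hypothesis against a non-empty reference."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)
```

For example, `cer("casa", "cosa")` is 0.25: one substitution over four reference characters.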