Ko Trocr
An OCR model that recognizes Korean initial sounds (initial consonants, or choseong) written on their own, using a modified tokenizer to address the standard TrOCR's weakness in this area
Downloads 2,035
Release Time: 3/9/2023
Model Overview
A Korean optical character recognition model based on the TrOCR architecture, modified to handle Korean initial sounds (initial consonants) and suited to digitizing Korean documents
Model Features
Korean Initial Sound Support
Uses a modified tokenizer for decoding so that standalone Korean initial sounds (initial consonants) are not output as the unknown token (UNK)
Professional Competition Validation
Technical solution validated by the 2023 Kyowon Group AI OCR Challenge
High-Quality Training Data
Trained using professional Korean OCR datasets from the AI Hub platform
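To illustrate the issue described under Korean Initial Sound Support: a Hangul syllable block packs an initial consonant, a vowel, and an optional final consonant into a single code point, so a vocabulary built only from full syllables has no entry for an initial consonant written on its own. The following is a minimal sketch using a toy character-level vocabulary. The helper names initial_consonants and encode, and the vocabulary itself, are hypothetical and do not reproduce this model's actual tokenizer.

```python
# Toy illustration only: this is NOT the model's tokenizer.
# The 19 initial consonants in Unicode order, as standalone compatibility jamo.
CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"

HANGUL_BASE = 0xAC00      # first precomposed syllable, "가"
SYLLABLE_COUNT = 11172    # 19 initials * 21 vowels * 28 finals
PER_INITIAL = 21 * 28     # number of syllables sharing one initial consonant


def initial_consonants(text: str) -> str:
    """Replace each precomposed Hangul syllable with its initial consonant."""
    out = []
    for ch in text:
        offset = ord(ch) - HANGUL_BASE
        if 0 <= offset < SYLLABLE_COUNT:
            out.append(CHOSEONG[offset // PER_INITIAL])
        else:
            out.append(ch)  # leave non-syllable characters unchanged
    return "".join(out)


def encode(text: str, vocab: set, unk: str = "[UNK]") -> list:
    """Hypothetical character-level lookup: unknown characters map to unk."""
    return [ch if ch in vocab else unk for ch in text]


# Assumed toy vocabularies: full syllables only, then the same plus initials.
syllable_only_vocab = set("한국어")
extended_vocab = syllable_only_vocab | set(CHOSEONG)

abbreviated = initial_consonants("한국어")
print(abbreviated)
print(encode(abbreviated, syllable_only_vocab))
print(encode(abbreviated, extended_vocab))
```

With the syllable-only vocabulary every character of the abbreviated string falls back to the unknown token, while the extended vocabulary keeps each initial consonant intact.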
Model Capabilities
Korean text recognition
Printed text extraction
Document digitization processing
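As a rough sketch of how the capabilities above might be exercised, the following uses the Hugging Face transformers classes commonly used with TrOCR checkpoints. Several points are assumptions rather than facts from this page: the model's hub identifier is not given here, so model_id must be supplied by the reader; the checkpoint is assumed to load with TrOCRProcessor and VisionEncoderDecoderModel; and the input is assumed to be a cropped image of a single text line, which is the usual input for TrOCR-style models. The NFC normalization step is an optional precaution against decomposed Hangul output, and whether this checkpoint needs it is not stated here. The function names recognize and to_nfc are hypothetical.

```python
import unicodedata


def to_nfc(text: str) -> str:
    """Compose any decomposed Hangul jamo into precomposed syllables."""
    return unicodedata.normalize("NFC", text)


def recognize(image_path: str, model_id: str, max_length: int = 64) -> str:
    """Run OCR on one cropped text-line image (sketch under the assumptions above)."""
    # Imported lazily so that to_nfc stays usable without these packages.
    from PIL import Image
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    # Assumption: the checkpoint ships a processor config loadable this way.
    processor = TrOCRProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)

    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
    generated_ids = model.generate(pixel_values, max_length=max_length)
    text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
    return to_nfc(text)
```

For full-page scans, a separate text detection or line segmentation step would typically be needed before calling a function like this on each cropped line.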
Use Cases
Document Processing
Public Administrative Document Digitization
Convert paper administrative documents into editable electronic text
Recognize text in official documents containing complex Korean characters
Printed Material Transcription
Extract Korean text from books, magazines, and other printed materials