Ko Trocr

Developed by ddobokki
A Korean OCR model that recognizes Korean initial sounds (standalone initial consonants). It uses an improved tokenizer to fix a weakness of the standard TrOCR setup, which fails to recognize these characters.
Downloads 2,035
Release Time: 3/9/2023

Model Overview

A Korean optical character recognition (OCR) model built on the TrOCR architecture. It specifically addresses the recognition of Korean initial sounds and is suited to digitizing Korean documents.

Model Features

Korean Initial Sound Support
Uses a dedicated tokenizer for decoding so that Korean initial sounds are not output as unknown (UNK) tokens
Competition Validation
The approach was validated in the 2023 Kyowon Group AI OCR Challenge
High-Quality Training Data
Trained on Korean OCR datasets from the AI Hub platform
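
To make "initial sounds" concrete: they are the standalone leading consonants of Hangul syllables, such as ㅋㅋ in informal writing. The sketch below is only an illustration of that concept using standard Unicode arithmetic for the Hangul syllable block. It is not the model's tokenizer, and the names `initial_sound`, `to_initials`, and `has_standalone_initials` are hypothetical helpers introduced here.

```python
# Hangul syllables occupy U+AC00..U+D7A3 and are laid out as
# 19 initials x 21 medials x 28 finals (the 28 includes "no final").
CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"  # the 19 initial consonants
SYLLABLE_BASE = 0xAC00
SYLLABLE_LAST = 0xD7A3
SYLLABLES_PER_INITIAL = 21 * 28  # medials x finals


def initial_sound(ch: str) -> str:
    """Return the initial consonant of a Hangul syllable, or ch unchanged."""
    code = ord(ch)
    if SYLLABLE_BASE <= code <= SYLLABLE_LAST:
        return CHOSEONG[(code - SYLLABLE_BASE) // SYLLABLES_PER_INITIAL]
    return ch


def to_initials(text: str) -> str:
    """Replace every Hangul syllable in text with its initial consonant."""
    return "".join(initial_sound(ch) for ch in text)


def has_standalone_initials(text: str) -> bool:
    """True if text contains initial consonants written on their own."""
    return any(ch in CHOSEONG for ch in text)


if __name__ == "__main__":
    print(to_initials("한국어"))
    print(has_standalone_initials("ㅋㅋ 좋아요"))
```

A check like `has_standalone_initials` could be used to pick test images that exercise exactly the case this model targets, since text made only of full syllables would not reveal the UNK problem described above.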

Model Capabilities

Korean text recognition
Printed text extraction
Document digitization processing
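
A minimal sketch of how these capabilities might be invoked through the Hugging Face `transformers` library. Several things here are assumptions not stated on this page: the Hub identifier `ddobokki/ko-trocr`, that the repository loads with `TrOCRProcessor`, `VisionEncoderDecoderModel`, and `AutoTokenizer`, and the `max_length` value. The NFC normalization step is a precaution added for illustration in case decoded text contains decomposed jamo. `recognize` and `clean_prediction` are hypothetical helper names.

```python
import unicodedata

MODEL_ID = "ddobokki/ko-trocr"  # assumed Hub id; verify before use


def clean_prediction(text: str) -> str:
    """NFC-normalize and trim decoded text so decomposed jamo compose into syllables."""
    return unicodedata.normalize("NFC", text).strip()


def recognize(image_path: str, model_id: str = MODEL_ID, max_length: int = 64) -> str:
    """Run OCR on a single text-line image and return the decoded string."""
    # Imported lazily so clean_prediction works without these packages installed.
    from PIL import Image
    from transformers import AutoTokenizer, TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
    generated_ids = model.generate(pixel_values, max_length=max_length)
    text = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    return clean_prediction(text)
```

Calling `recognize("line.png")` with a cropped image of one printed text line would download the weights on first use. Full-page documents would typically need a separate text-line detection step first, which is outside the scope of this sketch.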

Use Cases

Document Processing
Public Administrative Document Digitization
Converts paper administrative documents into editable electronic text
Recognizes official documents containing complex Korean characters
Printed Material Transcription
Extracts Korean text from books, magazines, and other printed materials