Thai Trocr Thaigov V2
Developed by kkatiz
A Thai handwriting recognition model based on a vision encoder-decoder architecture, suitable for a range of Thai OCR tasks
Downloads 339
Release Time: 3/8/2024
Model Overview
This model follows the TrOCR architecture, pairing a pretrained vision encoder with a Thai language-model decoder to recognize handwritten Thai text. It was fine-tuned on 250,000 synthetic text images and targets Thai OCR scenarios such as government documents.
Model Features
Hybrid Pretraining Architecture
The encoder uses the pretrained microsoft/trocr-base-handwritten model; the decoder uses the airesearch/wangchanberta-base-att-spm-uncased model
Large-scale Thai Data Fine-tuning
Fine-tuned on 250,000 synthetic text images from the Thai Government V2 corpus
Synthetic Data Augmentation
Uses SynthTIGER to generate high-quality synthetic text images, improving model generalization
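The hybrid architecture listed above can be sketched with the Hugging Face transformers library. This is a minimal illustration, not the author's training code. The function name build_hybrid_model, the choice of special tokens, and the use of AutoModelForCausalLM to load the decoder are assumptions. The cross-attention layers added to the decoder start untrained, so the assembled model only becomes useful after fine-tuning.

```python
# Checkpoint names as given in the feature list above.
ENCODER_SOURCE = "microsoft/trocr-base-handwritten"
DECODER_SOURCE = "airesearch/wangchanberta-base-att-spm-uncased"


def build_hybrid_model(encoder_source=ENCODER_SOURCE, decoder_source=DECODER_SOURCE):
    """Assemble a vision encoder-decoder from two pretrained checkpoints.

    Hypothetical helper: imports are deferred so this file can be loaded
    without transformers installed; calling it downloads both checkpoints.
    """
    from transformers import (
        AutoConfig,
        AutoModelForCausalLM,
        AutoTokenizer,
        VisionEncoderDecoderModel,
    )

    # The TrOCR checkpoint is itself an encoder-decoder model;
    # keep only its vision encoder.
    trocr = VisionEncoderDecoderModel.from_pretrained(encoder_source)
    encoder = trocr.encoder

    # Configure the Thai language model as a decoder with cross-attention
    # so it can attend to the image features.
    decoder_config = AutoConfig.from_pretrained(decoder_source)
    decoder_config.is_decoder = True
    decoder_config.add_cross_attention = True
    decoder = AutoModelForCausalLM.from_pretrained(decoder_source, config=decoder_config)

    model = VisionEncoderDecoderModel(encoder=encoder, decoder=decoder)

    # Assumed token choices for generation; the original setup is not documented here.
    tokenizer = AutoTokenizer.from_pretrained(decoder_source)
    model.config.decoder_start_token_id = tokenizer.cls_token_id
    model.config.pad_token_id = tokenizer.pad_token_id
    model.config.eos_token_id = tokenizer.sep_token_id
    return model, tokenizer
```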
Model Capabilities
Thai Handwriting Recognition
Image-to-text
Document OCR Processing
Use Cases
Government Document Processing
Government Document Recognition
Automatically recognize handwritten content in Thai government documents
Example recognition result: 'รมว.ธรรมนัส ลงพื้นที่'
Education Sector
Student Handwritten Assignment Grading
Recognize content in Thai students' handwritten assignments
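For the use cases above, inference might look like the following sketch. It assumes the model is published on the Hugging Face Hub as kkatiz/thai-trocr-thaigov-v2 and that the checkpoint ships a TrOCR-compatible processor; neither detail is stated on this page. The helper names load_ocr and recognize are hypothetical.

```python
# Assumed Hub id; this page only gives the model name and its developer.
MODEL_ID = "kkatiz/thai-trocr-thaigov-v2"


def load_ocr(model_id=MODEL_ID):
    """Load the image processor/tokenizer pair and the model (downloads weights)."""
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    model.eval()  # inference only, disables dropout
    return processor, model


def recognize(image_path, processor, model, max_new_tokens=128):
    """Return the text recognized in a single cropped text-line image."""
    import torch
    from PIL import Image

    # TrOCR-style models expect an RGB image containing one line of text.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    with torch.no_grad():
        generated_ids = model.generate(pixel_values, max_new_tokens=max_new_tokens)

    text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
    return text.strip()
```

Typical usage would be to call load_ocr() once and then recognize("line.png", processor, model) for each image, where line.png is a placeholder file name. Full-page scans generally need to be split into text lines first, since the model is described as a text recognizer rather than a layout detector.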