Multicentury HTR Model

Developed by Kansallisarkisto
A Transformer-based handwritten text recognition (HTR) model for Swedish and Finnish, intended for digitizing historical documents.
Downloads 39
Release date: 10/7/2024

Model Overview

This model is a fine-tuned version of microsoft/trocr-large-handwritten. It recognizes handwritten text from the 17th to the 20th century and is intended for document digitization and the transcription of handwritten notes.
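Because the model is a fine-tune of a TrOCR checkpoint, it can presumably be loaded with the TrOCR classes from the Hugging Face transformers library. The following is a minimal sketch, not the publisher's documented usage. The repository ID `Kansallisarkisto/multicentury-htr-model` is an assumption inferred from the model and developer names, and loading the processor from the base checkpoint assumes the tokenizer was left unchanged during fine-tuning. If the fine-tuned repository ships its own processor files, those should be preferred.

```python
# Assumed identifiers: neither is confirmed by this page beyond the base checkpoint name.
ASSUMED_MODEL_ID = "Kansallisarkisto/multicentury-htr-model"  # hypothetical repository ID
BASE_CHECKPOINT = "microsoft/trocr-large-handwritten"         # base model named above


def transcribe_line(image_path, model_id=ASSUMED_MODEL_ID, processor_id=BASE_CHECKPOINT):
    """Transcribe one cropped text-line image and return the predicted string."""
    # Heavy dependencies are imported lazily so this module loads without them.
    from PIL import Image
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(processor_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)

    # TrOCR expects an RGB image of a single line of text.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    # Autoregressively generate token IDs, then decode them to text.
    generated_ids = model.generate(pixel_values)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

TrOCR-style models work on single text lines, so a full page scan would first need to be segmented into line images (a step this sketch does not cover) before each crop is passed to `transcribe_line`.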

Model Features

Multi-century Handwriting Support
Training data includes handwriting samples from the 17th to 20th centuries, adapting to diverse writing styles.
Nordic Language Optimization
Tuned for the characters specific to Finnish and Swedish (e.g., å, ä, ö).
High Accuracy Recognition
Achieves a character error rate (CER) of 3.2 on a test set of 94,900 lines of text.
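For readers unfamiliar with the metric, CER is conventionally the total character-level edit distance between predictions and reference transcriptions, divided by the total number of reference characters. The sketch below follows that standard definition. The page does not state how the reported figure was computed or in which unit, so this is an illustration rather than the evaluation procedure actually used. The Unicode NFC normalization step is an added assumption: it prevents composed and decomposed forms of å, ä and ö from being counted as errors.

```python
import unicodedata


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references, hypotheses) -> float:
    """Corpus-level CER as a fraction: total edits / total reference characters."""
    # Assumption: normalize to NFC so "a" + combining diaeresis equals "ä".
    refs = [unicodedata.normalize("NFC", r) for r in references]
    hyps = [unicodedata.normalize("NFC", h) for h in hypotheses]
    total_chars = sum(len(r) for r in refs)
    if total_chars == 0:
        raise ValueError("reference text is empty")
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_edits / total_chars
```

For example, `cer(["äiti"], ["aiti"])` counts one substitution over four reference characters. Multiply the result by 100 to express it as a percentage.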

Model Capabilities

Handwritten Text Recognition
Historical Document Transcription
Table Data Extraction

Use Cases

Archive Digitization
Historical Manuscript Transcription
Convert historical handwritten documents in archives into searchable digital text.
CER 3.2 (test set of 94,900 lines of text)
Personal Applications
Handwritten Note Transcription
Convert personal handwritten notes into digital text.