court-records-htr: an open-source handwritten text recognition model for 19th-century Finnish and Swedish court records, free to use

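Court Records HTR

Developed by Kansallisarkisto (the National Archives of Finland)

A handwritten text recognition model fine-tuned from Microsoft's TrOCR, built for 19th-century court records in Finnish and Swedish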

Text recognition

PyTorch

License: MIT #HistoricalHandwritingRecognition #FinnishSwedishOCR #CourtArchiveDigitization

Downloads: 24

Released: 9/12/2024

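Model overview

The model recognizes handwritten text in text-line images. It was trained specifically on digitized 19th-century court records written in Finnish and Swedish.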

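Model features

Optimized for historical documents

Trained specifically on 19th-century handwriting rather than modern script, targeting the recognition of historical documents

Multilingual support

Recognizes handwriting in both Finnish and Swedish

High recognition accuracy

Reaches a character error rate (CER) of 2.4% and a word error rate (WER) of 11.3% on the validation set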
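
For readers unfamiliar with these two metrics, the sketch below shows one common way to compute them. CER is the edit distance between the reference and the predicted text divided by the reference length in characters; WER is the same ratio over whitespace-separated words. The function names are illustrative, and the source does not state how its own figures were computed or aggregated, so this is an assumption rather than the actual evaluation script.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: insertions, deletions and substitutions cost 1."""
    previous = list(range(len(hyp) + 1))
    for i, ref_item in enumerate(ref, start=1):
        current = [i]
        for j, hyp_item in enumerate(hyp, start=1):
            cost = 0 if ref_item == hyp_item else 1
            current.append(min(
                previous[j] + 1,         # delete an item from the reference
                current[j - 1] + 1,      # insert an item from the hypothesis
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate of one line, relative to the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate of one line, using whitespace tokenization."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


if __name__ == "__main__":
    # One of two words differs, so the WER of this line is 0.5.
    print(wer("oikeus istui", "oikeus istu"))
```

A single wrong character makes the whole word count as an error, which is why a WER is normally several times higher than the corresponding CER.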

Model capabilities

Handwritten text recognition

Historical document processing

Multilingual text extraction

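Use cases

Digitization of historical archives

Court record transcription

Converts 19th-century handwritten court records into searchable digital text

Automatic transcription with a character error rate of 2.4% on the validation set

Genealogical research

Processing historical population records

Automatically recognizes handwritten entries in historical population registers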
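
To make "searchable digital text" concrete, here is a minimal sketch of a word index over transcribed lines. Everything in it is an assumption made for illustration: the `(document id, line number)` keys, the function names, and the sample strings, which are invented and are not real transcriptions. The model only produces the text of each line; how that text is stored and searched is up to the user.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Hypothetical identifier for a transcribed line: (document id, line number).
LineRef = Tuple[str, int]

PUNCTUATION = ".,;:!?()\"'"


def build_index(lines: Dict[LineRef, str]) -> Dict[str, Set[LineRef]]:
    """Map each lowercased word to the set of lines in which it occurs."""
    index: Dict[str, Set[LineRef]] = defaultdict(set)
    for ref, text in lines.items():
        for token in text.lower().split():
            word = token.strip(PUNCTUATION)
            if word:
                index[word].add(ref)
    return index


def search(index: Dict[str, Set[LineRef]], word: str) -> List[LineRef]:
    """Return the sorted references of all lines containing `word`."""
    return sorted(index.get(word.lower(), set()))


if __name__ == "__main__":
    # Invented sample lines standing in for model output.
    transcribed = {
        ("doc1", 1): "Talollinen Matti Virtanen.",
        ("doc1", 2): "Nämndeman Erik Berg",
        ("doc2", 1): "matti virtanen kantajana",
    }
    index = build_index(transcribed)
    print(search(index, "Matti"))
```

Exact-match lookup like this misses words that contain recognition errors, so a real archive search would likely add fuzzy matching on top.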

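🚀 Handwritten text recognition model for 19th-century Finnish court records

This model performs handwritten text recognition on text-line images. It was trained by fine-tuning Microsoft's TrOCR model on digitized 19th-century court records in Finnish and Swedish.

🚀 Quick start

The code below uses the model to predict the text in a text-line image. If a GPU is available, using it for inference is recommended.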

from transformers import TrOCRProcessor, VisionEncoderDecoderModel
from PIL import Image
import torch

# Use GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Model location in Hugging Face Hub
model_checkpoint = "Kansallisarkisto/court-records-htr"
# Path to textline image
line_image_path = "/path/to/textline_image.jpg"

# Initialize processor and model
processor = TrOCRProcessor.from_pretrained(model_checkpoint)
model = VisionEncoderDecoderModel.from_pretrained(model_checkpoint).to(device)

# Open image file and extract pixel values
image = Image.open(line_image_path).convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values

# Use the model to generate predictions 
generated_ids = model.generate(pixel_values.to(device))
# Use the processor to decode ids to text
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(generated_text)
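
A page of court records contains many text lines, so in practice the quick-start steps are usually repeated over a list of line images. The sketch below wraps the same three calls (`processor(...)`, `model.generate(...)`, `processor.batch_decode(...)`) in a batched loop. The helper names `batched` and `transcribe_lines` and the default batch size of 8 are assumptions for illustration, not part of the model's API. The images are assumed to be already cropped to single text lines and converted to RGB.

```python
from typing import Iterator, List, Sequence


def batched(items: Sequence, batch_size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def transcribe_lines(images: Sequence, processor, model, device,
                     batch_size: int = 8) -> List[str]:
    """Transcribe text-line images batch by batch, preserving input order.

    `processor`, `model` and `device` are the objects created in the
    quick-start code; `images` is a sequence of RGB PIL images.
    """
    texts: List[str] = []
    for batch in batched(images, batch_size):
        # The processor accepts a list of images and returns one stacked tensor.
        pixel_values = processor(list(batch), return_tensors="pt").pixel_values
        generated_ids = model.generate(pixel_values.to(device))
        texts.extend(processor.batch_decode(generated_ids, skip_special_tokens=True))
    return texts
```

With the quick-start objects in scope, a call could look like `transcribe_lines([Image.open(p).convert("RGB") for p in paths], processor, model, device)`, where `paths` is a hypothetical list of text-line image files. Larger batches make better use of a GPU but need more memory, so the batch size is worth tuning to the hardware.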