trocr-base-handwritten_nj_biergarten_captcha_v2 Open-source Captcha OCR Model

Developed by phunc20

A fine-tuned CAPTCHA OCR model based on microsoft/trocr-base-handwritten, trained on the nj_biergarten_captcha_v2 dataset

Text Recognition

Transformers

Open Source License: GPL-3.0 #CAPTCHA OCR #Handwriting Recognition #Low CER Rate

Downloads 24

Release Time: 2/4/2025

Model Overview

This model is designed for CAPTCHA recognition: it extracts the text shown in a CAPTCHA image.

Model Features

High Accuracy CAPTCHA Recognition

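Very low CER (Character Error Rate) on the dataset it was fine-tuned for: 0.001333 on the test split of nj_biergarten_captcha_v2. Accuracy drops on other CAPTCHA styles (CER 0.4381 on a Kaggle CAPTCHA dataset).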

Transformer-based Architecture

Uses the TrOCR architecture, which pairs an image Transformer encoder with a text Transformer decoder

Fine-tuning Optimization

Fine-tuned from the base model microsoft/trocr-base-handwritten specifically for CAPTCHA text

Model Capabilities

CAPTCHA Text Recognition

Image-to-Text Conversion

Character Recognition

Use Cases

Security Verification

CAPTCHA Auto-Recognition

Used for CAPTCHA recognition in automated testing or auxiliary tools

Achieves a 99.2% exact match rate (496/500) on the test split of nj_biergarten_captcha_v2

Data Collection

CAPTCHA Dataset Labeling

Assists in automatic labeling of CAPTCHA datasets

🚀 trocr-base-handwritten_nj_biergarten_captcha_v2

A model for CAPTCHA OCR, fine-tuned from microsoft/trocr-base-handwritten.

🚀 Quick Start

Use the code below to get started with the model:

import torch
from PIL import Image
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

# Use a GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the processor and the fine-tuned weights from the Hugging Face Hub.
hub_dir = "phunc20/trocr-base-handwritten_nj_biergarten_captcha_v2"
processor = TrOCRProcessor.from_pretrained(hub_dir)
model = VisionEncoderDecoderModel.from_pretrained(hub_dir)
model = model.to(device)

# Convert to RGB: the processor expects three-channel images.
image = Image.open("/path/to/image").convert("RGB")
pixel_values = processor(image, return_tensors='pt').pixel_values
pixel_values = pixel_values.to(device)

# Generate token IDs and decode them into the predicted CAPTCHA text.
outputs = model.generate(pixel_values)
pred_str = processor.batch_decode(outputs, skip_special_tokens=True)[0]

✨ Features

This is a simple model fine-tuned from microsoft/trocr-base-handwritten on the dataset phunc20/nj_biergarten_captcha_v2.

📚 Documentation

Model Details

This is a model for CAPTCHA OCR, fine-tuned from microsoft/trocr-base-handwritten on the dataset phunc20/nj_biergarten_captcha_v2.

Uses

Direct Use

The provided Python code demonstrates how to use the model for CAPTCHA OCR.
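For labeling more than one image, the single-image snippet can be extended to batches. The following is a minimal sketch and not part of the original model card: the helper names `chunked` and `recognize_batch` and the default batch size of 8 are illustrative assumptions, while the `transformers` calls are the same ones used in the Quick Start.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def recognize_batch(image_paths: Sequence[str], batch_size: int = 8) -> List[str]:
    """Hypothetical helper: predict the text of many CAPTCHA images."""
    # Heavy imports stay local so that `chunked` is usable without them.
    import torch
    from PIL import Image
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    hub_dir = "phunc20/trocr-base-handwritten_nj_biergarten_captcha_v2"
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    processor = TrOCRProcessor.from_pretrained(hub_dir)
    model = VisionEncoderDecoderModel.from_pretrained(hub_dir).to(device)
    model.eval()

    predictions: List[str] = []
    for paths in chunked(image_paths, batch_size):
        # Convert to RGB because the processor expects three-channel images.
        images = [Image.open(p).convert("RGB") for p in paths]
        pixel_values = processor(images=images, return_tensors="pt").pixel_values
        with torch.no_grad():  # inference only, no gradients needed
            outputs = model.generate(pixel_values.to(device))
        predictions.extend(
            processor.batch_decode(outputs, skip_special_tokens=True)
        )
    return predictions
```

Calling `recognize_batch(["a.png", "b.png"])` would return one predicted string per path, in the same order. The batch size is a memory trade-off and should be tuned to the available hardware.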

Bias, Risks, and Limitations

Although the model performs well on the dataset phunc20/nj_biergarten_captcha_v2, it does not perform as well on CAPTCHA images in general. In this respect, the model is worse than a human.

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

Training Details

Training Data

The model was trained on the train split of phunc20/nj_biergarten_captcha_v2 and evaluated on the validation split, without using the test split.

Training Procedure

Please refer to https://gitlab.com/phunc20/captchew/-/blob/main/colab_notebooks/train_from_pretrained_Seq2SeqTrainer_torchDataset.ipynb?ref_type=heads, which is adapted from https://github.com/NielsRogge/Transformers-Tutorials/blob/master/TrOCR/Fine_tune_TrOCR_on_IAM_Handwriting_Database_using_Seq2SeqTrainer.ipynb.

Evaluation

Testing Data, Factors & Metrics

Testing Data

The test split of phunc20/nj_biergarten_captcha_v2.
The Kaggle dataset https://www.kaggle.com/datasets/fournierp/captcha-version-2-images/data (referred to as kaggle_test_set in this model card).

Factors

[More Information Needed]

Metrics

CER (character error rate), exact match, and average length difference. The first two are described in Hugging Face's documentation. The last is a custom metric, explained at https://gitlab.com/phunc20/captchew/-/blob/v0.1/average_length_difference.py.
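To make the three metrics concrete, here is a small pure-Python sketch. It is not the evaluation code used for this model card. CER is computed at corpus level as total edit distance divided by total reference length, and the definition of average length difference (mean absolute difference between predicted and reference lengths) is an assumption; the linked script remains the authoritative definition. All function names are illustrative.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between a reference and a hypothesis string."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cer(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Corpus-level character error rate: total edits / total reference chars."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_chars = sum(len(r) for r in refs)
    return total_edits / total_chars


def exact_match(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Fraction of predictions identical to their reference."""
    return sum(r == h for r, h in zip(refs, hyps)) / len(refs)


def avg_len_diff(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """ASSUMED definition: mean absolute difference in string length."""
    return sum(abs(len(r) - len(h)) for r, h in zip(refs, hyps)) / len(refs)


refs = ["ab12", "xyz9"]
hyps = ["ab12", "xyz"]   # second prediction drops the final character
print(cer(refs, hyps), exact_match(refs, hyps), avg_len_diff(refs, hyps))
```

Because inserted characters count as errors, CER can exceed 1 when predictions are much longer than the references (for example, `cer(["ab"], ["abcde"])` is 1.5). This is how a value such as 1.0112 can arise.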

Results

On the test split of phunc20/nj_biergarten_captcha_v2:

Model	cer	exact match	avg len diff
`phunc20/trocr-base-handwritten_nj_biergarten_captcha_v2`	0.001333	496/500	1/500
`microsoft/trocr-base-handwritten`	0.9	5/500	2.4

On kaggle_test_set:

Model	cer	exact match	avg len diff
`phunc20/trocr-base-handwritten_nj_biergarten_captcha_v2`	0.4381	69/1070	0.1289
`microsoft/trocr-base-handwritten`	1.0112	17/1070	2.4439
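The exact-match columns above are reported as counts over different test-set sizes, so converting them to percentages makes the two tables easier to compare. The sketch below only redoes that arithmetic; the short labels "fine-tuned" and "base" are shorthand introduced here for the two models in the tables.

```python
# Exact-match counts copied from the two result tables: (correct, total).
exact_match_counts = {
    ("test split", "fine-tuned"): (496, 500),
    ("test split", "base"): (5, 500),
    ("kaggle_test_set", "fine-tuned"): (69, 1070),
    ("kaggle_test_set", "base"): (17, 1070),
}


def to_percent(correct: int, total: int) -> float:
    """Convert a count of exact matches into a percentage."""
    return 100.0 * correct / total


exact_match_percent = {
    key: to_percent(*counts) for key, counts in exact_match_counts.items()
}

for (dataset, model_name), pct in exact_match_percent.items():
    print(f"{dataset:16s} {model_name:10s} {pct:6.2f}%")
```

The fine-tuned model goes from roughly 99% exact match on its own test split to under 7% on kaggle_test_set, which quantifies the limitation described under Bias, Risks, and Limitations.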

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

Hardware Type: [More Information Needed]
Hours used: [More Information Needed]
Cloud Provider: [More Information Needed]
Compute Region: [More Information Needed]
Carbon Emitted: [More Information Needed]

📄 License

This project is licensed under the GPL-3.0 license.
