trocr-base-printed-synthetic_dataset_ocr
This model is a fine-tuned version of microsoft/trocr-base-printed for image-to-text tasks. It was trained on the 20k synthetic OCR dataset linked under "Training and evaluation data".
Quick Start
The model is a fine-tuned version of microsoft/trocr-base-printed. It takes an image of printed text as input and returns the recognized text.
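The original card does not include inference code, so the following is only a minimal sketch of how a TrOCR checkpoint is typically loaded with the transformers library. The Hub repository id in `MODEL_ID` is an assumption inferred from the model name and the author's GitHub handle, and the processor is loaded from the base checkpoint in case the fine-tuned repository does not ship processor files. The function name `read_text` is hypothetical.

```python
from pathlib import Path

# Assumption: repository id inferred from the model name and the author's
# GitHub handle; adjust it if the checkpoint is hosted elsewhere.
MODEL_ID = "DunnBC22/trocr-base-printed-synthetic_dataset_ocr"
# The base checkpoint named in the card; its processor handles image
# preprocessing and token decoding.
PROCESSOR_ID = "microsoft/trocr-base-printed"


def read_text(image_path, model_id=MODEL_ID, processor_id=PROCESSOR_ID):
    """Return the text recognized in one cropped image of printed text."""
    # Heavy dependencies are imported here so that defining the function
    # does not require them to be installed.
    from PIL import Image
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(processor_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)

    # TrOCR expects three-channel input, so grayscale images are converted.
    image = Image.open(Path(image_path)).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    generated_ids = model.generate(pixel_values)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

Calling `read_text("label.png")` (a hypothetical file name) should return a single string. TrOCR works on single text lines, so larger images such as full labels or pages usually need to be cropped into lines first.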
Features
This model could be used to read labels with printed text.
Documentation
Model description
Here is the link to my code for this model: https://github.com/DunnBC22/Vision_Audio_and_Multimodal_Projects/tree/main/Optical%20Character%20Recognition%20(OCR)/20%2C000%20Synthetic%20Samples%20Dataset
Intended uses & limitations
This model could be used to read labels with printed text.
Training and evaluation data
Here is the link to the dataset that I used for this model: https://www.kaggle.com/datasets/ravi02516/20k-synthetic-ocr-dataset
Character length distribution for the training dataset: see Images/Input Characgter Length Distribution for Training Dataset.png in the project folder linked under "Model description".
Character length distribution for the evaluation dataset: see Images/Input Characgter Length Distribution for Evaluation Dataset.png in the same folder.
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 1
- mixed_precision_training: Native AMP
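As a rough illustration, the listed hyperparameters can be written as keyword arguments in the naming used by transformers' `Seq2SeqTrainingArguments`. The mapping is an assumption (for example, that `train_batch_size` corresponds to `per_device_train_batch_size`), not a copy of the author's training script. The helper `linear_lr` is a hypothetical pure-Python sketch of what a linear schedule does to the learning rate. The warmup length is not listed in the card and is assumed to be zero, and the total step count below is made up for the demonstration.

```python
# Hyperparameters from the card, using Seq2SeqTrainingArguments-style names
# (assumed mapping, not taken from the author's code).
training_kwargs = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 1,
    "fp16": True,  # "Native AMP" mixed precision
}


def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Learning rate at `step` for a linear schedule with optional warmup."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * (step / max(1, warmup_steps))
    # Afterwards decay linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


# Hypothetical run length of 2,000 optimizer steps (not a figure from the card).
for step in (0, 1000, 2000):
    print(step, linear_lr(step, total_steps=2000))
```

With no warmup, the rate starts at the configured 5e-05, is halved at the midpoint, and reaches zero on the final step.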
Training results
CER = 0.003 (unrounded: 0.002896524170994806)
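For context, character error rate is the character-level edit distance between the prediction and the reference, divided by the number of reference characters. A CER of about 0.003 therefore corresponds to roughly 3 character errors per 1,000 reference characters. The sketch below is a self-contained pure-Python version of that definition. The author's evaluation most likely relied on a metrics library, which this card does not confirm, and the function names and sample strings here are hypothetical.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two strings, counted in characters."""
    # previous[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            deletion = previous[j] + 1      # reference character dropped
            insertion = current[j - 1] + 1  # extra hypothesis character
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def cer(references, hypotheses):
    """Corpus-level CER: total character edits / total reference characters."""
    total_chars = sum(len(ref) for ref in references)
    if total_chars == 0:
        raise ValueError("references must contain at least one character")
    total_edits = sum(
        edit_distance(ref, hyp) for ref, hyp in zip(references, hypotheses)
    )
    return total_edits / total_chars


# Made-up sample: one wrong character out of 24 reference characters.
references = ["INVOICE 2023", "TOTAL: 42.00"]
hypotheses = ["INVOICE 2028", "TOTAL: 42.00"]
print(cer(references, hypotheses))
```

Summing edits and reference lengths over the whole corpus before dividing weights long and short samples by their length. Averaging per-sample CER values is a different aggregation and can give a different number.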
Framework versions
- Transformers 4.26.1
- Pytorch 1.13.1+cu116
- Datasets 2.10.1
- Tokenizers 0.13.2
Model Checkpoint
@misc{li2021trocr,
  title={TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models},
  author={Minghao Li and Tengchao Lv and Lei Cui and Yijuan Lu and Dinei Florencio and Cha Zhang and Zhoujun Li and Furu Wei},
  year={2021},
  eprint={2109.10282},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
Metric (Character Error Rate [CER])
@inproceedings{morris2004,
  author = {Morris, Andrew and Maier, Viktoria and Green, Phil},
  year = {2004},
  month = {01},
  pages = {},
  title = {From WER and RIL to MER and WIL: improved evaluation measures for connected speech recognition.}
}
Important Note
Please make sure to give proper credit to the owner(s) of the data and developers of the model (microsoft/trocr-base-printed).