marian-finetuned-multidataset-kin-to-en
This is a fine-tuned model for Kinyarwanda-to-English machine translation. It reaches a BLEU score of 36.27 on its evaluation set.
Quick Start
This model is a fine-tuned version of [Helsinki-NLP/opus-mt-rw-en](https://huggingface.co/Helsinki-NLP/opus-mt-rw-en), trained on a combination of Kinyarwanda-English datasets (listed under Training and Evaluation Data). It achieves the following results on the evaluation set:
- Loss: 1.7550
- Bleu: 36.2717
Features
- The model has been fine-tuned to perform machine translation from Kinyarwanda to English.
- The primary intended use of this model is for research purposes.
Usage Examples
The original model card does not include usage examples.
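As a minimal sketch, a Marian checkpoint like this one can usually be loaded through the `transformers` translation pipeline. The repository ID below is a hypothetical placeholder (the card does not state the Hub namespace), and `normalize`, `translate`, and `max_length=128` are illustrative choices rather than part of the original card. Because the training data was lowercased and stripped of digits, the sketch applies the same normalization to inputs, on the assumption that this matches what the model saw during training.

```python
import re

# Hypothetical placeholder: replace with the actual Hub repository ID of this model.
MODEL_ID = "your-namespace/marian-finetuned-multidataset-kin-to-en"


def normalize(text: str) -> str:
    """Lowercase the text and drop digits, mirroring the documented training preprocessing."""
    return re.sub(r"\d", "", text.lower())


def translate(sentences, model_id=MODEL_ID, max_length=128):
    """Translate a list of Kinyarwanda sentences to English with a Marian checkpoint."""
    # Imported lazily so that normalize() works even without transformers installed.
    from transformers import pipeline

    translator = pipeline("translation", model=model_id)
    outputs = translator([normalize(s) for s in sentences], max_length=max_length)
    return [out["translation_text"] for out in outputs]
```

Calling `translate(["Muraho, amakuru yawe?"])` downloads the checkpoint on first use and returns a list with one English string. Passing `model_id="Helsinki-NLP/opus-mt-rw-en"` runs the base model instead, which can be useful for comparison.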
Documentation
Model Description
The model has been fine-tuned to perform machine translation from Kinyarwanda to English.
Intended Uses & Limitations
The primary intended use of this model is for research purposes.
Training and Evaluation Data
The model was fine-tuned using a combination of datasets from the following sources:
- [Digital Umuganda](https://huggingface.co/datasets/DigitalUmuganda/kinyarwanda-english-machine-translation-dataset/tree/main)
- [Masakhane](https://huggingface.co/datasets/masakhane/mafand/viewer/en-kin/validation)
- Muennighoff
Before training, the combined dataset was preprocessed as follows:
- Text was converted to lowercase
- Digits were removed
The combined dataset was then split into training and validation sets, with 90% used for training and 10% for validation.
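The preprocessing and split described above can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' script: the card does not say whether both language sides were normalized, whether the data was shuffled, or which seed drove the split, so the sketch normalizes both sides, shuffles, and reuses the training seed of 42. The function names are hypothetical.

```python
import random
import re


def preprocess(text: str) -> str:
    """Convert text to lowercase and remove digits (whitespace is left untouched)."""
    return re.sub(r"\d", "", text.lower())


def preprocess_and_split(pairs, train_fraction=0.9, seed=42):
    """Normalize (source, target) pairs, then split them into train and validation lists.

    Assumptions: both sides are normalized, the data is shuffled before
    splitting, and the seed matches the training seed.
    """
    cleaned = [(preprocess(src), preprocess(tgt)) for src, tgt in pairs]
    rng = random.Random(seed)
    rng.shuffle(cleaned)
    cut = int(len(cleaned) * train_fraction)
    return cleaned[:cut], cleaned[cut:]
```

Note that removing digits can leave doubled spaces (for example, `"abana 3 bakina"` becomes `"abana  bakina"`). The card does not mention whitespace cleanup, so the sketch does not add any.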
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
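To make the `linear` scheduler setting concrete, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup, a single device, and no gradient accumulation, none of which the card states. The helper names and the example dataset size are hypothetical.

```python
import math

BASE_LR = 2e-05   # learning_rate from the list above
BATCH_SIZE = 32   # train_batch_size from the list above
NUM_EPOCHS = 3    # num_epochs from the list above


def total_training_steps(num_examples: int, batch_size: int = BATCH_SIZE,
                         epochs: int = NUM_EPOCHS) -> int:
    """Optimizer steps for a full run, assuming one device and no gradient accumulation."""
    return math.ceil(num_examples / batch_size) * epochs


def linear_lr(step: int, total_steps: int, warmup_steps: int = 0,
              base_lr: float = BASE_LR) -> float:
    """Learning rate under linear warmup (optional) followed by linear decay to zero."""
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

With a hypothetical 1,000 training pairs, `total_training_steps(1000)` gives 96 steps. The rate starts at 2e-05, is half that at step 48, and reaches zero at step 96.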
Framework versions
- Transformers 4.31.0
- Pytorch 2.0.1+cu118
- Datasets 2.14.4
- Tokenizers 0.13.3
Technical Details
The model is a fine-tuned version of [Helsinki-NLP/opus-mt-rw-en](https://huggingface.co/Helsinki-NLP/opus-mt-rw-en), trained on multiple datasets for Kinyarwanda-to-English machine translation. Dataset preprocessing and training hyperparameters are described in the Documentation section.
License
The model is licensed under the Apache 2.0 license.