marian-finetuned-multidataset-kin-to-en: an open-source model for efficient Kinyarwanda-to-English translation

Marian Finetuned Multidataset Kin To En

Developed by RogerB

This model is a fine-tuned machine translation model for Kinyarwanda to English, based on Helsinki-NLP/opus-mt-rw-en

Machine Translation

Transformers

Open Source License: Apache-2.0

#Kinyarwanda translation #Multi-dataset fine-tuning #Low-resource optimization

Downloads 64

Release Time: 8/10/2023

Model Overview

A machine translation model fine-tuned with multiple datasets, specifically designed for translating Kinyarwanda to English, primarily for research purposes

Model Features

Multi-dataset fine-tuning

Trained by integrating multiple professional translation datasets including Digital Umuganda, Masakhane, and Flores200

Preprocessing optimization

Text was standardized before training by converting it to lowercase and removing digits

Research-grade performance

Achieves a BLEU score of 36.27 on the validation set, suitable for academic research

Model Capabilities

Kinyarwanda to English text translation

Cross-language semantic conversion

Use Cases

Academic research

African language machine translation research

Used for comparative studies of low-resource language translation algorithms

Provides benchmark performance of 36.27 BLEU

Language services

Basic document translation

Handles preliminary translation of simple Kinyarwanda documents

🚀 marian-finetuned-multidataset-kin-to-en

This is a fine-tuned model for Kinyarwanda to English machine translation. It achieves a BLEU score of 36.27 on the evaluation set.

🚀 Quick Start

This model is a fine-tuned version of [Helsinki-NLP/opus-mt-rw-en](https://huggingface.co/Helsinki-NLP/opus-mt-rw-en), trained on the combined datasets described under Training and Evaluation Data. It achieves the following results on the evaluation set:

Loss: 1.7550
BLEU: 36.2717

✨ Features

The model has been fine-tuned to perform machine translation from Kinyarwanda to English.
The primary intended use of this model is for research purposes.

📦 Installation

The original model card does not list installation steps. The library versions used for training are given under Framework versions.

💻 Usage Examples

The original model card does not include code examples.
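
As a minimal sketch of how the model might be used, the snippet below loads it with the Marian classes from the Transformers library. The repository id `RogerB/marian-finetuned-multidataset-kin-to-en` is an assumption inferred from the developer and model name, and the helper name `translate` and the `max_new_tokens` default are illustrative choices, not part of the model card. Running `translate` requires the `transformers`, `torch`, and `sentencepiece` packages and network access to download the weights.

```python
from typing import List

# Assumed Hugging Face repository id (developer name + model name); not confirmed by the source.
MODEL_ID = "RogerB/marian-finetuned-multidataset-kin-to-en"


def translate(sentences: List[str], model_id: str = MODEL_ID, max_new_tokens: int = 128) -> List[str]:
    """Translate Kinyarwanda sentences to English (hypothetical helper)."""
    # Imported lazily so the module can be imported without the heavy dependencies.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(model_id)
    model = MarianMTModel.from_pretrained(model_id)

    # Tokenize the whole batch at once, padding to the longest sentence.
    batch = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(**batch, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)


# Example call (downloads the model on first use):
# print(translate(["muraho, amakuru yawe?"]))
```

Because the training data was lowercased and stripped of digits, lowercased input may be closer to what the model saw during fine-tuning. Digits in the input have no counterpart in the training data, so their handling should be checked manually.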

📚 Documentation

Model Description

The model has been fine-tuned to perform machine translation from Kinyarwanda to English.

Intended Uses & Limitations

The primary intended use of this model is for research purposes.

Training and Evaluation Data

The model was fine-tuned using a combination of datasets from the following sources:

[Digital Umuganda](https://huggingface.co/datasets/DigitalUmuganda/kinyarwanda-english-machine-translation-dataset/tree/main)
[Masakhane](https://huggingface.co/datasets/masakhane/mafand/viewer/en-kin/validation)
Muennighoff (Flores200)

For the training of the machine translation model, the dataset underwent the following preprocessing steps:

Text was converted to lowercase
Digits were removed
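
The two preprocessing steps above can be sketched as follows. This is an illustration, not the author's script: the function name `preprocess` is hypothetical, and collapsing the whitespace left behind by removed digits is an added assumption that the source does not mention.

```python
import re

_DIGITS = re.compile(r"\d+")   # runs of digits
_SPACES = re.compile(r"\s+")   # runs of whitespace


def preprocess(text: str) -> str:
    """Lowercase the text and remove digits, as described in the model card."""
    lowered = text.lower()
    without_digits = _DIGITS.sub("", lowered)
    # Assumption: tidy up the gaps left where digits were removed.
    return _SPACES.sub(" ", without_digits).strip()


print(preprocess("Mfite Ibitabo 3 mu Rugo"))  # mfite ibitabo mu rugo
```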

The combined dataset was divided into training and validation sets, with a split of 90% for training and 10% for validation.
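
A 90/10 split of this kind could be sketched as below. The source does not say how the split was made, so the shuffling, the reuse of seed 42 from the training hyperparameters, and the name `split_pairs` are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple

Pair = Tuple[str, str]  # (Kinyarwanda sentence, English sentence)


def split_pairs(pairs: Sequence[Pair], train_fraction: float = 0.9,
                seed: int = 42) -> Tuple[List[Pair], List[Pair]]:
    """Shuffle sentence pairs and split them into training and validation sets."""
    shuffled = list(pairs)
    # A dedicated Random instance keeps the split reproducible for a given seed.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder data standing in for the combined dataset.
pairs = [(f"kin {i}", f"en {i}") for i in range(100)]
train, validation = split_pairs(pairs)
print(len(train), len(validation))  # 90 10
```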

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 32
eval_batch_size: 64
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 3
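
The sketch below shows one way these values could be passed to the Transformers `Seq2SeqTrainingArguments` class, together with the learning rate a linear schedule would produce at a given step. Several things here are assumptions: that the author used `Seq2SeqTrainingArguments` at all, that training ran on a single device (so the reported batch sizes equal the per-device sizes), and that there were no warmup steps. The names `HYPERPARAMETERS`, `linear_lr`, and `build_training_arguments` are hypothetical.

```python
# Values copied from the hyperparameter list in the model card.
HYPERPARAMETERS = {
    "learning_rate": 2e-05,
    "train_batch_size": 32,
    "eval_batch_size": 64,
    "seed": 42,
    "adam_betas": (0.9, 0.999),
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_epochs": 3,
}


def linear_lr(step: int, total_steps: int,
              base_lr: float = HYPERPARAMETERS["learning_rate"]) -> float:
    """Learning rate under linear decay to zero, assuming zero warmup steps."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


def build_training_arguments(output_dir: str):
    """Map the listed values onto Seq2SeqTrainingArguments (assumed setup)."""
    # Imported lazily so the rest of the sketch runs without transformers installed.
    from transformers import Seq2SeqTrainingArguments

    beta1, beta2 = HYPERPARAMETERS["adam_betas"]
    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        learning_rate=HYPERPARAMETERS["learning_rate"],
        # Assumption: single device, so total batch size equals per-device size.
        per_device_train_batch_size=HYPERPARAMETERS["train_batch_size"],
        per_device_eval_batch_size=HYPERPARAMETERS["eval_batch_size"],
        seed=HYPERPARAMETERS["seed"],
        adam_beta1=beta1,
        adam_beta2=beta2,
        adam_epsilon=HYPERPARAMETERS["adam_epsilon"],
        lr_scheduler_type=HYPERPARAMETERS["lr_scheduler_type"],
        num_train_epochs=HYPERPARAMETERS["num_epochs"],
    )


# Halfway through training, the linear schedule has halved the learning rate.
print(linear_lr(step=500, total_steps=1000))
```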

Training results

The original model card does not report per-epoch training results. The final results on the evaluation set are a loss of 1.7550 and a BLEU score of 36.2717.

Framework versions

Transformers 4.31.0
PyTorch 2.0.1+cu118
Datasets 2.14.4
Tokenizers 0.13.3

🔧 Technical Details

The model is a fine-tuned version of [Helsinki-NLP/opus-mt-rw-en](https://huggingface.co/Helsinki-NLP/opus-mt-rw-en), fine-tuned on multiple datasets for Kinyarwanda-to-English machine translation. The preprocessing of the dataset and the training hyperparameters are described in the documentation section.

📄 License

The model is licensed under the Apache 2.0 license.
