🚀 AI-translator-eng-to-9ja
AI-translator-eng-to-9ja is a 418-million-parameter translation model that translates English into Yoruba, Igbo, and Hausa. Trained on a dataset of 1,500,000 sentence pairs (500,000 per language), it produces high-quality translations for all three languages. The broader aim is a system that makes it easier to interact with LLMs in Igbo, Hausa, and Yoruba.
✨ Features
- Multilingual Translation: Translates from English to Yoruba, Igbo, and Hausa.
- High-Quality Output: Trained on 1.5 million sentence pairs for accurate translations.
- LLM Communication: Facilitates interaction with LLMs in local languages (see the pipeline sketch after the usage examples below).
📦 Installation
No installation script is required beyond the standard Hugging Face stack: install `transformers`, `torch`, and `huggingface_hub` (for example, `pip install transformers torch huggingface_hub`), then load the model as shown below.
💻 Usage Examples
Basic Usage
```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer
import huggingface_hub

# Authenticate with the Hugging Face Hub (only needed if your environment requires it)
huggingface_hub.login()

# Load the model and tokenizer
model = M2M100ForConditionalGeneration.from_pretrained("HelpMumHQ/AI-translator-eng-to-9ja")
tokenizer = M2M100Tokenizer.from_pretrained("HelpMumHQ/AI-translator-eng-to-9ja")

# The source language is always English; the target language is selected
# per call via forced_bos_token_id.
tokenizer.src_lang = "en"

# English -> Igbo
eng_text = "Healthcare is an important field in virtually every society because it directly affects the well-being and quality of life of individuals. It encompasses a wide range of services and professions, including preventive care, diagnosis, treatment, and management of diseases and conditions."
encoded_eng = tokenizer(eng_text, return_tensors="pt")
generated_tokens = model.generate(**encoded_eng, forced_bos_token_id=tokenizer.get_lang_id("ig"))
print(tokenizer.batch_decode(generated_tokens, skip_special_tokens=True))

# English -> Yoruba
eng_text = "Healthcare is an important field in virtually every society because it directly affects the well-being and quality of life of individuals. It encompasses a wide range of services and professions, including preventive care, diagnosis, treatment, and management of diseases and conditions. Effective healthcare systems aim to improve health outcomes, reduce the incidence of illness, and ensure that individuals have access to necessary medical services."
encoded_eng = tokenizer(eng_text, return_tensors="pt")
generated_tokens = model.generate(**encoded_eng, forced_bos_token_id=tokenizer.get_lang_id("yo"))
print(tokenizer.batch_decode(generated_tokens, skip_special_tokens=True))

# English -> Hausa (same source text as the Yoruba example)
generated_tokens = model.generate(**encoded_eng, forced_bos_token_id=tokenizer.get_lang_id("ha"))
print(tokenizer.batch_decode(generated_tokens, skip_special_tokens=True))
```
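Since the model's stated goal is to ease interaction with LLMs in local languages, here is a minimal sketch of that pipeline: an English answer produced by any LLM is translated before being shown to the user. The `get_llm_answer` function is a hypothetical stand-in for whatever LLM client you use; only the translation calls follow the model's actual API.

```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model = M2M100ForConditionalGeneration.from_pretrained("HelpMumHQ/AI-translator-eng-to-9ja")
tokenizer = M2M100Tokenizer.from_pretrained("HelpMumHQ/AI-translator-eng-to-9ja")

def translate(text: str, target_lang: str) -> str:
    """Translate English text into the given target language ('yo', 'ig', or 'ha')."""
    tokenizer.src_lang = "en"
    encoded = tokenizer(text, return_tensors="pt")
    tokens = model.generate(**encoded, forced_bos_token_id=tokenizer.get_lang_id(target_lang))
    return tokenizer.batch_decode(tokens, skip_special_tokens=True)[0]

def get_llm_answer(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM client call; returns a canned reply here.
    return "Vaccination protects individuals and communities by preventing the spread of disease."

english_answer = get_llm_answer("Explain why vaccination matters.")
print(translate(english_answer, "yo"))  # deliver the LLM's answer in Yoruba
```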
📚 Documentation
Model Details
| Property | Details |
|----------|---------|
| Languages Supported | Source language: English; target languages: Yoruba, Igbo, Hausa |
| Model Type | 418-million-parameter translation model |
| Training Data | 1,500,000 translation pairs from open-source parallel corpora and curated datasets for Yoruba, Igbo, and Hausa |
Supported Language Codes
- English: `en`
- Yoruba: `yo`
- Igbo: `ig`
- Hausa: `ha`
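These are the codes accepted by `tokenizer.src_lang` and `tokenizer.get_lang_id`. As a quick illustration, the sketch below translates a single sentence into all three target languages:

```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model = M2M100ForConditionalGeneration.from_pretrained("HelpMumHQ/AI-translator-eng-to-9ja")
tokenizer = M2M100Tokenizer.from_pretrained("HelpMumHQ/AI-translator-eng-to-9ja")
tokenizer.src_lang = "en"  # source is always English

# Translate one sentence into each supported target language
encoded = tokenizer("Drink clean water every day.", return_tensors="pt")
for code, name in {"yo": "Yoruba", "ig": "Igbo", "ha": "Hausa"}.items():
    tokens = model.generate(**encoded, forced_bos_token_id=tokenizer.get_lang_id(code))
    print(name, "->", tokenizer.batch_decode(tokens, skip_special_tokens=True)[0])
```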
Training Dataset
The training dataset consists of 1,500,000 translation pairs, sourced from a combination of open-source parallel corpora and curated datasets specific to Yoruba, Igbo, and Hausa.
Limitations
- Although the model performs well on English-to-Yoruba, English-to-Igbo, and English-to-Hausa translation, performance may vary with the complexity and domain of the text.
- Translation quality may decline for extremely long sentences or ambiguous contexts; one mitigation is to translate long passages sentence by sentence, as in the sketch below.
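A minimal sketch of that mitigation, assuming the input is plain English prose: split the passage into sentences and translate each one separately. The period-based splitter here is illustrative only; a proper sentence segmenter would be more robust.

```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model = M2M100ForConditionalGeneration.from_pretrained("HelpMumHQ/AI-translator-eng-to-9ja")
tokenizer = M2M100Tokenizer.from_pretrained("HelpMumHQ/AI-translator-eng-to-9ja")
tokenizer.src_lang = "en"

def translate_long(text: str, target_lang: str) -> str:
    """Translate a long English passage sentence by sentence (naive period split)."""
    outputs = []
    for sentence in (s.strip() for s in text.split(".")):
        if not sentence:
            continue
        encoded = tokenizer(sentence + ".", return_tensors="pt")
        tokens = model.generate(**encoded, forced_bos_token_id=tokenizer.get_lang_id(target_lang))
        outputs.append(tokenizer.batch_decode(tokens, skip_special_tokens=True)[0])
    return " ".join(outputs)

long_text = "First sentence of a long passage. Second sentence. Third sentence."  # stand-in input
print(translate_long(long_text, "ig"))
```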
Training Hyperparameters
The following hyperparameters were used during training (the sketch after this list shows how they would map onto `Seq2SeqTrainingArguments`):
- learning_rate: 2e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
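For reference, here is how these values would map onto `Seq2SeqTrainingArguments` from transformers. This is a reconstruction under the assumption that the standard `Seq2SeqTrainer` setup was used, not the authors' actual training script; the `output_dir` is a placeholder.

```python
from transformers import Seq2SeqTrainingArguments

# Reconstruction of the reported hyperparameters; output_dir is a placeholder.
training_args = Seq2SeqTrainingArguments(
    output_dir="eng-to-9ja-checkpoints",  # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    adam_beta1=0.9,      # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,   # Adam epsilon
    lr_scheduler_type="linear",
    num_train_epochs=3,
)
```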
Framework Versions
- Transformers 4.44.2
- PyTorch 2.4.0+cu121
- Datasets 2.21.0
- Tokenizers 0.19.1
📄 License
This model is licensed under the MIT license.