nllb-200-distilled-600M-wo-fr-en Open-source Model - Precise Bidirectional Translation between Wolof, French, and English

Developed by bilalfaye

This model is a fine-tuned multilingual translation model based on NLLB-200-distilled-600M, specifically optimized for bidirectional translation between Wolof, French, and English.

Machine Translation

Transformers

Supports Multiple Languages

Open Source License: MIT

#Wolof translation #Multilingual translation #Low-resource optimization

Downloads 114

Release Time: 1/20/2025

Model Overview

The model supports bidirectional translation between Wolof, French, and English, including Wolof↔French, Wolof↔English, and French↔English translation tasks.

Model Features

Multilingual Bidirectional Translation

Supports six translation directions between Wolof, French, and English

Optimized Preprocessed Data

Fine-tuned using deeply preprocessed Wolof-French-English parallel corpora

Efficient Inference

Based on the distilled version of the NLLB model, improving inference efficiency while maintaining performance

Model Capabilities

Wolof to French translation

French to Wolof translation

English to Wolof translation

Wolof to English translation

French to English translation

English to French translation

Use Cases

Language Services

Cross-language Communication

Helps Wolof speakers communicate with French or English speakers

Achieves accurate and fluent translation for daily conversations

Document Translation

Converts official documents or educational materials between Wolof, French, and English

Maintains accuracy of professional terminology and contextual consistency

Education

Language Learning Assistance

Helps students learning Wolof, French, or English understand correspondences between different languages

Provides instant translation references to accelerate the language learning process

🚀 Bilalfaye's Translation Model

This is a fine-tuned translation model that supports multiple language pairs, including French-Wolof, Wolof-French, English-Wolof, and more. It was trained on well-preprocessed datasets to improve translation quality.

🚀 Quick Start

Manual Inference

First, install the required libraries (the code below imports torch, and NllbTokenizer depends on sentencepiece):

!pip install transformers sentencepiece torch

Then, use the following Python code for translation:

from transformers import NllbTokenizer, AutoModelForSeq2SeqLM
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model_load_name = 'bilalfaye/nllb-200-distilled-600M-wo-fr-en'

# Load model and tokenizer
model = AutoModelForSeq2SeqLM.from_pretrained(model_load_name).to(device)
tokenizer = NllbTokenizer.from_pretrained(model_load_name)

def translate(
    text, src_lang='wol_Latn', tgt_lang='fra_Latn',
    a=32, b=3, max_input_length=1024, num_beams=4, **kwargs
):
    """Turn a text or a list of texts into a list of translations"""
    # Tell the tokenizer which language tags to use
    tokenizer.src_lang = src_lang
    tokenizer.tgt_lang = tgt_lang
    inputs = tokenizer(
        text, return_tensors='pt', padding=True, truncation=True,
        max_length=max_input_length
    )
    model.eval()
    result = model.generate(
        **inputs.to(model.device),
        # Force the decoder to start with the target-language token
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        # Cap the output length at a + b * (number of input tokens)
        max_new_tokens=int(a + b * inputs.input_ids.shape[1]),
        num_beams=num_beams, **kwargs
    )
    return tokenizer.batch_decode(result, skip_special_tokens=True)

# Example usage
print(translate("Ndax mën nga ko waxaat su la neexee?", src_lang="wol_Latn", tgt_lang="fra_Latn")[0])
print(translate("Ndax mën nga ko waxaat su la neexee?", src_lang="wol_Latn", tgt_lang="eng_Latn")[0])
print(translate("Bonjour, où allez-vous?", src_lang="fra_Latn", tgt_lang="wol_Latn")[0])
print(translate("Bonjour, où allez-vous?", src_lang="fra_Latn", tgt_lang="eng_Latn")[0])
print(translate("Hello, how are you?", src_lang="eng_Latn", tgt_lang="wol_Latn")[0])
print(translate("Hello, how are you?", src_lang="eng_Latn", tgt_lang="fra_Latn")[0])

Inference with Pipeline

Install the required library:

!pip install transformers

Use the following Python code with the pipeline:

import torch
from transformers import pipeline

model_name = 'bilalfaye/nllb-200-distilled-600M-wo-fr-en'
device = "cuda" if torch.cuda.is_available() else "cpu"

translator = pipeline("translation", model=model_name, device=device)

print(translator("Ndax mën nga ko waxaat su la neexee?", src_lang="wol_Latn", tgt_lang="fra_Latn")[0]['translation_text'])
print(translator("Bonjour, où allez-vous?", src_lang="fra_Latn", tgt_lang="wol_Latn")[0]['translation_text'])
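
For more than a handful of sentences, it can help to feed the pipeline in fixed-size chunks. The sketch below assumes the translation pipeline accepts a list of strings and returns one dict with a 'translation_text' key per input; translate_in_batches and batch_size are hypothetical names introduced here for illustration and are not part of the model card.

```python
def translate_in_batches(translator, texts, src_lang, tgt_lang, batch_size=8):
    """Translate a list of texts chunk by chunk and return a flat list of strings.

    `translator` is any callable with the pipeline's call shape:
    translator(list_of_texts, src_lang=..., tgt_lang=...) -> list of dicts.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    translations = []
    for start in range(0, len(texts), batch_size):
        chunk = texts[start:start + batch_size]
        results = translator(chunk, src_lang=src_lang, tgt_lang=tgt_lang)
        # Assumption: each result is a dict carrying 'translation_text'
        translations.extend(r["translation_text"] for r in results)
    return translations
```

With the translator object created above, a call would look like translate_in_batches(translator, sentences, "wol_Latn", "fra_Latn"), where sentences is your own list of Wolof strings. A suitable batch_size depends on available memory and sentence length.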

✨ Features

Bidirectional Translation: Supports multiple language pairs including Wolof-French, French-Wolof, English-Wolof, Wolof-English, French-English, and English-French.
Fine-Tuned Model: Based on the nllb-200-distilled-600M model, fine-tuned for better performance.
Preprocessed Datasets: Trained on preprocessed datasets bilalfaye/english-wolof-french-translation and bilalfaye/english-wolof-french-translation-bis to enhance translation quality.

📦 Installation

To use this model, install the transformers library, along with torch and sentencepiece, which the usage examples rely on:

!pip install transformers sentencepiece torch

💻 Usage Examples

Basic Usage

The above code snippets for manual inference and inference with pipeline show basic usage examples.

Advanced Usage

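You can tune the parameters of the translate function for different scenarios. a and b set the output length cap (max_new_tokens = a + b × number of input tokens), max_input_length truncates long inputs, and num_beams sets the beam search width. Any extra keyword arguments are passed through to model.generate.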
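
To see how a and b affect the output length cap, the arithmetic from the translate function can be pulled out on its own. This is a small sketch; max_new_tokens_budget is a hypothetical helper name, and the input lengths are made-up token counts rather than measurements.

```python
def max_new_tokens_budget(input_length, a=32, b=3):
    """Same formula translate() passes as max_new_tokens: a + b * input_length."""
    return int(a + b * input_length)

# Illustrative token counts, not real measurements
for n_tokens in (5, 20, 100):
    print(n_tokens, max_new_tokens_budget(n_tokens))
# prints:
# 5 47
# 20 92
# 100 332
```

Lowering b tightens the cap for long inputs, while a guarantees a minimum allowance for very short ones.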

📚 Documentation

Model Description

This model is a fine-tuned version of nllb-200-distilled-600M, specifically adapted for French-Wolof and Wolof-French translation. It was trained using the bilalfaye/english-wolof-french-translation and bilalfaye/english-wolof-french-translation-bis datasets, which underwent significant preprocessing to enhance translation quality.

The model supports bidirectional translation:

Wolof to French
French to Wolof
English to Wolof
Wolof to English
French to English
English to French
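
The six directions can be enumerated from the three language codes used in the usage examples (wol_Latn, fra_Latn, eng_Latn). The sketch below also rejects mistyped codes before they reach the tokenizer; supported_directions and check_direction are hypothetical helper names, not part of the model or the transformers library.

```python
from itertools import permutations

# Language codes as used in the usage examples of this card
LANG_CODES = {"Wolof": "wol_Latn", "French": "fra_Latn", "English": "eng_Latn"}

def supported_directions():
    """Return every ordered (source, target) code pair among the three languages."""
    return [(LANG_CODES[s], LANG_CODES[t]) for s, t in permutations(LANG_CODES, 2)]

def check_direction(src_lang, tgt_lang):
    """Raise ValueError for an unknown code or an identical source and target."""
    valid = set(LANG_CODES.values())
    for code in (src_lang, tgt_lang):
        if code not in valid:
            raise ValueError(f"Unknown language code {code!r}; expected one of {sorted(valid)}")
    if src_lang == tgt_lang:
        raise ValueError("Source and target languages must differ")
    return src_lang, tgt_lang

for src, tgt in supported_directions():
    print(src, "->", tgt)
```

Calling check_direction(src_lang, tgt_lang) before translating catches slips such as writing french_Latn where fra_Latn is meant.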

Test application: https://huggingface.co/spaces/bilalfaye/WoFrEn-Translator

Package Versions

This model was developed and tested using the following package versions:

Package	Version
transformers	4.41.2
torch	2.4.0+cu121
datasets	3.2.0
sentencepiece	0.2.0
sacrebleu	2.5.1
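
If results differ from what you expect, comparing your environment with the versions listed above is a reasonable first step. The sketch below uses the standard library's importlib.metadata; TESTED_VERSIONS and compare_versions are hypothetical names introduced for illustration. A mismatch is only a hint, since other versions may work equally well.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above
TESTED_VERSIONS = {
    "transformers": "4.41.2",
    "torch": "2.4.0+cu121",
    "datasets": "3.2.0",
    "sentencepiece": "0.2.0",
    "sacrebleu": "2.5.1",
}

def compare_versions(tested=TESTED_VERSIONS):
    """Map each package to (tested version, installed version or None, exact match)."""
    report = {}
    for package, wanted in tested.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report

for package, (wanted, installed, same) in compare_versions().items():
    status = "ok" if same else "differs"
    print(f"{package}: tested {wanted}, installed {installed} ({status})")
```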

🔧 Technical Details

The model is based on the nllb-200-distilled-600M architecture. The datasets used for training have been preprocessed to improve the model's translation performance. The model supports multiple language pairs through fine-tuning and can be used for both manual inference and inference with a pipeline.

📄 License

This model is released under the MIT license.

Author

Bilal Faye

Feel free to reach out for questions or improvements!
