Model Card: NLLB-200 French-Wolof Translation Model
This is a fine-tuned version of Meta's NLLB-200 (distilled, 600M parameters) model, specialized for French-to-Wolof translation, with the aim of making content more accessible between these two languages.
Quick Start
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("Lahad/nllb200-francais-wolof")
model = AutoModelForSeq2SeqLM.from_pretrained("Lahad/nllb200-francais-wolof")

def translate(text, max_length=128):
    # Tokenize the French source text
    inputs = tokenizer(
        text,
        max_length=max_length,
        padding="max_length",
        truncation=True,
        return_tensors="pt"
    )
    # Force generation to start in Wolof (wol_Latn)
    outputs = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        forced_bos_token_id=tokenizer.convert_tokens_to_ids("wol_Latn"),
        max_length=max_length
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```
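If you prefer the high-level API, the same model can also be run through a translation `pipeline`. This is a minimal sketch, not taken from the original card, assuming the standard `transformers` translation pipeline and the NLLB language codes `fra_Latn` and `wol_Latn`:

```python
from transformers import pipeline

# Translation pipeline using NLLB-style language codes
translator = pipeline(
    "translation",
    model="Lahad/nllb200-francais-wolof",
    src_lang="fra_Latn",
    tgt_lang="wol_Latn",
)

result = translator("Bonjour, comment allez-vous ?", max_length=128)
print(result[0]["translation_text"])
```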
Features
- Direct Use:
- Facilitates text translation between French and Wolof.
- Enables content localization.
- Assists in language learning.
- Supports cross-cultural communication.
- Out-of-Scope Use:
- Commercial use without proper licensing is prohibited.
- Not suitable for translating highly technical or specialized content.
- Inadequate for legal or medical document translation where professional human translation is required.
- Not designed for real-time speech translation.
Installation
Since this is a Hugging Face model, you can interact with it through the `transformers` library, which can be installed with:

```bash
pip install transformers
```
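Depending on your environment, you may also need a PyTorch backend and SentencePiece for the NLLB tokenizer; this fuller command is an assumption, not part of the original card:

```bash
pip install transformers torch sentencepiece
```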
Usage Examples
Basic Usage
text = "This is a test sentence."
translation = translate(text)
print(translation)
Advanced Usage
text = "This is a longer test sentence that requires more tokens to translate."
translation = translate(text, max_length = 256)
print(translation)
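To translate several sentences at once, the same tokenizer and model can be called on a list of strings. This batch helper is a minimal sketch, not part of the original card:

```python
# Hypothetical batch helper built on the same tokenizer/model as above
def translate_batch(texts, max_length=128):
    inputs = tokenizer(
        texts,
        max_length=max_length,
        padding=True,          # pad to the longest sentence in the batch
        truncation=True,
        return_tensors="pt",
    )
    outputs = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids("wol_Latn"),
        max_length=max_length,
    )
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)

print(translate_batch(["Bonjour.", "Merci beaucoup."]))
```

Padding to the longest sentence in the batch keeps memory usage lower than padding every input to `max_length`.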
Documentation
Model Details
| Property | Details |
|----------|---------|
| Model Type | Sequence-to-Sequence Translation Model |
| Developed by | Lahad |
| Language(s) | French (fra_Latn) → Wolof (wol_Latn) |
| License | CC-BY-NC-4.0 |
| Finetuned from model | facebook/nllb-200-distilled-600M |
| Repository | [Hugging Face - Lahad/nllb200-francais-wolof](https://huggingface.co/Lahad/nllb200-francais-wolof) |
| GitHub | [Fine-tuning NLLB-200 for French-Wolof](https://github.com/LahadMbacke/Fine-tuning_facebook-nllb-200-distilled-600M_French_to_Wolof) |
Bias, Risks, and Limitations
- Language Variety Limitations:
- Has limited coverage of regional Wolof dialects.
- May not handle cultural nuances effectively.
- Technical Limitations:
- Has a maximum context window of 128 tokens.
- Shows reduced performance on technical/specialized content.
- May struggle with informal language and slang.
- Potential Biases:
- Training data may reflect cultural biases.
- May perform better on standard/formal language.
Recommendations
- Use the model for general communication and content translation.
- Verify translations for critical communications.
- Consider regional language variations.
- Implement human review for sensitive content.
- Test translations in the intended context before deployment.
Technical Details
Training Details
Training Data
- Dataset: galsenai/centralized_wolof_french_translation_data
- Split: 80% training, 20% testing
- Format: JSON pairs of French and Wolof translations
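As a rough, hedged sketch, the dataset and the 80/20 split described above could be reproduced with the `datasets` library as follows (the split seed is an assumption, not stated in the card):

```python
from datasets import load_dataset

# Load the French-Wolof pairs and carve out a 20% test split
dataset = load_dataset("galsenai/centralized_wolof_french_translation_data")
split = dataset["train"].train_test_split(test_size=0.2, seed=42)  # seed is an assumption
train_data, test_data = split["train"], split["test"]
```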
Training Procedure
Preprocessing
- Dynamic tokenization with padding.
- Maximum sequence length: 128 tokens.
- Source/target language tags: fra_Latn/wol_Latn.
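A minimal sketch of this preprocessing step, reusing `train_data`/`test_data` from the dataset sketch above and assuming the dataset exposes `fr` and `wo` text columns (the column names are an assumption), could look like this:

```python
from transformers import DataCollatorForSeq2Seq

def preprocess(batch):
    # Tag the source as French and the target as Wolof before tokenizing
    tokenizer.src_lang = "fra_Latn"
    tokenizer.tgt_lang = "wol_Latn"
    return tokenizer(
        batch["fr"],               # assumed French column name
        text_target=batch["wo"],   # assumed Wolof column name
        max_length=128,
        truncation=True,
    )

tokenized_train = train_data.map(preprocess, batched=True)
tokenized_test = test_data.map(preprocess, batched=True)

# Pad dynamically at batch-assembly time, matching the "dynamic tokenization
# with padding" described above
data_collator = DataCollatorForSeq2Seq(tokenizer, model=model)
```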
Training Hyperparameters
- Learning rate: 2e-5
- Batch size: 8 per device
- Training epochs: 3
- FP16 training: Enabled
- Evaluation strategy: Per epoch
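These settings map onto `Seq2SeqTrainingArguments` and `Seq2SeqTrainer` roughly as in the sketch below; the output directory and evaluation batch size are assumptions, and it reuses the tokenized splits and collator from the preprocessing sketch:

```python
from transformers import Seq2SeqTrainingArguments, Seq2SeqTrainer

training_args = Seq2SeqTrainingArguments(
    output_dir="nllb200-francais-wolof",  # assumed output path
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,         # assumption; card only gives training batch size
    num_train_epochs=3,
    fp16=True,
    evaluation_strategy="epoch",          # "eval_strategy" in newer transformers releases
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_train,
    eval_dataset=tokenized_test,
    data_collator=data_collator,
    tokenizer=tokenizer,
)
trainer.train()
```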
Evaluation
- Testing Data: 20% of the dataset
- Metrics: [Not Specified]
- Evaluation Factors:
- Translation accuracy
- Semantic preservation
- Grammar correctness
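The card does not name a concrete metric. As a hedged illustration, corpus-level BLEU via the `evaluate` library is a common way to score translation accuracy on the held-out split:

```python
import evaluate

bleu = evaluate.load("sacrebleu")

# Illustrative placeholders: model outputs vs. gold Wolof references
predictions = ["model output sentence"]
references = [["reference translation sentence"]]

print(bleu.compute(predictions=predictions, references=references)["score"])
```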
Environmental Impact
- Hardware Type: NVIDIA T4 GPU
- Hours used: 5
- Cloud Provider: [Not Specified]
- Compute Region: [Not Specified]
- Carbon Emitted: [Not Calculated]
Model Architecture and Objective
- Architecture: NLLB-200 (distilled 600M version)
- Objective: Neural Machine Translation
- Parameters: 600M
- Context Window: 128 tokens
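As a quick sanity check (a sketch, assuming the model is already loaded as in the Quick Start), the parameter count can be verified directly:

```python
# Total parameters; should be roughly 600M for the distilled NLLB-200 checkpoint
num_params = sum(p.numel() for p in model.parameters())
print(f"{num_params / 1e6:.0f}M parameters")
```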
Compute Infrastructure
- Training Hardware: NVIDIA T4 GPU
- Training Time: 5 hours
- Software Framework: Hugging Face Transformers
License
The model is licensed under CC-BY-NC-4.0.
Model Card Contact
For questions about this model, please create an issue on the model's Hugging Face repository.