Model Documentation: Wolof to French Translation with NLLB-200
This model is fine-tuned from Meta's NLLB-200 for translating between Wolof and French, providing an efficient solution for cross-language communication.
Quick Start
This model is a machine translation model fine-tuned from Meta's NLLB-200, specifically for translation between Wolof and French. It is hosted at `cifope/nllb-200-wo-fr-distilled-600M` and uses a distilled version of the NLLB-200 model optimized for Wolof-French translation tasks.
Features
- Language Support: Translates between Wolof and French.
- Metrics: Evaluated with BLEU (see the evaluation sketch below).
- Pipeline Tag: translation.
- Tags: text-generation-inference.
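Since BLEU is the reported metric, one way to score the model's outputs is with the sacrebleu package. This is a minimal sketch rather than part of the model card: the hypothesis and reference sentences are placeholders, and sacrebleu must be installed separately (pip install sacrebleu).

```python
import sacrebleu

# Placeholder data: model outputs and aligned reference translations
hypotheses = ["la banque est fermée aujourd'hui"]
references = [["la banque est fermée aujourd'hui"]]  # one reference stream, aligned with hypotheses

# Corpus-level BLEU as computed by sacrebleu
bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU: {bleu.score:.2f}")
```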
Installation
The model requires Hugging Face's `transformers` library. You can install it with the following command:

```bash
pip install transformers
```
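Depending on your environment, you may also need PyTorch and sentencepiece (the NLLB tokenizer relies on a SentencePiece model). This is an assumption about a typical setup rather than a requirement stated above:

```bash
# Assumed extras for a typical PyTorch setup; adjust to your environment
pip install torch sentencepiece
```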
Usage Examples
Basic Usage
First, import the necessary classes from the `transformers` library and initialize the model and tokenizer:
```python
from transformers import AutoModelForSeq2SeqLM, NllbTokenizer

# Load the fine-tuned Wolof-French model and the base NLLB-200 tokenizer
model = AutoModelForSeq2SeqLM.from_pretrained('cifope/nllb-200-wo-fr-distilled-600M')
tokenizer = NllbTokenizer.from_pretrained('facebook/nllb-200-distilled-600M')
```
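If a GPU is available, the model can be moved to it before generating. This is standard transformers/PyTorch usage, not something the model card prescribes:

```python
import torch

# Optional: run generation on GPU when one is available
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
```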
Advanced Usage
Translate from French to Wolof
The `translate` function below translates text from French to Wolof. Its `a` and `b` parameters cap the output length relative to the input (`max_new_tokens = a + b * input_length`):
```python
def translate(text, src_lang='fra_Latn', tgt_lang='wol_Latn', a=16, b=1.5, max_input_length=1024, **kwargs):
    # Configure the tokenizer for the source/target language pair
    tokenizer.src_lang = src_lang
    tokenizer.tgt_lang = tgt_lang
    inputs = tokenizer(text, return_tensors='pt', padding=True, truncation=True, max_length=max_input_length)
    result = model.generate(
        **inputs.to(model.device),
        # Force the first generated token to be the target-language code
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        # Bound the output length: a + b * (input length in tokens)
        max_new_tokens=int(a + b * inputs.input_ids.shape[1]),
        **kwargs
    )
    return tokenizer.batch_decode(result, skip_special_tokens=True)
```
Translate from Wolof to French
The `reversed_translate` function can be used to translate text from Wolof to French:
```python
def reversed_translate(text, src_lang='wol_Latn', tgt_lang='fra_Latn', a=16, b=1.5, max_input_length=1024, **kwargs):
    # Same as translate, with the default language codes swapped
    tokenizer.src_lang = src_lang
    tokenizer.tgt_lang = tgt_lang
    inputs = tokenizer(text, return_tensors='pt', padding=True, truncation=True, max_length=max_input_length)
    result = model.generate(
        **inputs.to(model.device),
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        max_new_tokens=int(a + b * inputs.input_ids.shape[1]),
        **kwargs
    )
    return tokenizer.batch_decode(result, skip_special_tokens=True)
```
Example of Using Translation Functions

```python
# French to Wolof
french_text = "L'argent peut être échangé à la seule banque des îles située à Stanley"
wolof_translation = translate(french_text)
print(wolof_translation)

# Wolof to French
wolof_text = "alkaati yi tàmbali nañu xàll léegi kilifa gi ñów"
french_translation = reversed_translate(wolof_text)
print(french_translation)

# Wolof to English, by overriding the target language code
english_translation = reversed_translate(wolof_text, tgt_lang="eng_Latn")
print(english_translation)
```
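Because the tokenizer is called with padding enabled and batch_decode returns a list, both functions also accept a list of sentences. The sentences below are illustrative placeholders:

```python
# Batch translation sketch (placeholder sentences)
french_sentences = [
    "Bonjour, comment allez-vous ?",
    "Le marché ouvre demain matin.",
]
print(translate(french_sentences))  # one Wolof translation per input sentence
```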
License
This project is licensed under the MIT license.