Moroccan Darija to English Translation Model (Fine-Tuned mBART)
This model is a fine-tuned version of mBART for translating Moroccan Darija to English. It builds on Facebook's multilingual mBART model and is trained on a dataset of Moroccan Darija to English translation pairs.
Quick Start
You can load the model and tokenizer with the Hugging Face transformers library. Here's an example:
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast
# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model_name = 'echarif/mBART_for_darija_transaltion'
model = MBartForConditionalGeneration.from_pretrained(model_name)
tokenizer = MBart50TokenizerFast.from_pretrained(model_name)
# Tokenize the input sentence into PyTorch tensors
input_text = "insert your Moroccan Darija sentence here"
inputs = tokenizer(input_text, return_tensors="pt", padding=True)
# Generate the translation and decode it back to plain text
translated_tokens = model.generate(**inputs)
translated_text = tokenizer.decode(translated_tokens[0], skip_special_tokens=True)
print(f"Translated Text: {translated_text}")
Features
- Accurate Translation: Fine-tuned specifically for Moroccan Darija to English translation, targeting conversational and informal text.
- Multilingual Base: Built on the multilingual mBART model, which supports translation and other sequence-to-sequence tasks such as summarization.
Installation
The model requires the Hugging Face transformers library. The examples also request PyTorch tensors (return_tensors="pt"), so PyTorch must be installed as well. You can install both with the following command:
pip install transformers torch
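To confirm the installation before downloading the model, a quick import check can help. This is a minimal sketch, not part of the original model card: the helper name missing_packages is hypothetical, and the default package list assumes PyTorch is the backend, since the examples here use return_tensors="pt".

```python
import importlib.util


def missing_packages(names=("transformers", "torch")):
    """Return the packages from `names` that cannot be found (hypothetical helper)."""
    # find_spec returns None when a top-level package is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```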
Usage Examples
Basic Usage
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast
# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model_name = 'echarif/mBART_for_darija_transaltion'
model = MBartForConditionalGeneration.from_pretrained(model_name)
tokenizer = MBart50TokenizerFast.from_pretrained(model_name)
# Tokenize the input sentence into PyTorch tensors
input_text = "insert your Moroccan Darija sentence here"
inputs = tokenizer(input_text, return_tensors="pt", padding=True)
# Generate the translation and decode it back to plain text
translated_tokens = model.generate(**inputs)
translated_text = tokenizer.decode(translated_tokens[0], skip_special_tokens=True)
print(f"Translated Text: {translated_text}")
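The same calls extend to several sentences at once. The following is a minimal sketch rather than a documented feature of this model: the helper names translate_batch and load_model are hypothetical, and the batch size, num_beams=4, and max_new_tokens=128 are illustrative assumptions, not tuned values.

```python
def translate_batch(sentences, model, tokenizer, batch_size=8,
                    num_beams=4, max_new_tokens=128):
    """Translate a list of sentences in fixed-size batches (hypothetical helper)."""
    translations = []
    for start in range(0, len(sentences), batch_size):
        batch = sentences[start:start + batch_size]
        # Pad to the longest sentence in the batch and truncate overly long inputs.
        inputs = tokenizer(batch, return_tensors="pt", padding=True, truncation=True)
        # Beam search and the output length cap are assumed settings.
        outputs = model.generate(**inputs, num_beams=num_beams,
                                 max_new_tokens=max_new_tokens)
        translations.extend(tokenizer.batch_decode(outputs, skip_special_tokens=True))
    return translations


def load_model(model_name="echarif/mBART_for_darija_transaltion"):
    """Load the model and tokenizer the same way as the basic example."""
    from transformers import MBartForConditionalGeneration, MBart50TokenizerFast
    model = MBartForConditionalGeneration.from_pretrained(model_name)
    tokenizer = MBart50TokenizerFast.from_pretrained(model_name)
    return model, tokenizer


# Example (downloads the model on first use):
# model, tokenizer = load_model()
# print(translate_batch(["first Darija sentence", "second Darija sentence"],
#                       model, tokenizer))
```

Batching keeps memory use bounded for long lists, and results come back in the same order as the input sentences.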
Documentation
Model Overview
| Property | Details |
|---|---|
| Model Type | mBART (Multilingual BART) |
| Language Pair | Moroccan Darija → English |
| Task | Machine Translation |
| Training Data | The model was fine-tuned on a custom dataset containing Moroccan Darija to English translation pairs. |
Model Details
The mBART model is a transformer-based sequence-to-sequence model designed to handle multiple languages. It is particularly useful for tasks such as translation, text generation, and summarization.
For this task, the model has been fine-tuned to translate text from Moroccan Darija to English, making it suitable for applications that translate conversational and informal text from Morocco.
Intended Use
This model can be used to:
- Translate sentences from Moroccan Darija to English.
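Because the intended use is sentence-level translation, longer passages may work better when split into sentences first. The sketch below is one way to do that under stated assumptions: split_sentences and translate_text are hypothetical helpers, and the punctuation-based splitting rule is a naive heuristic that has not been validated on Darija text.

```python
import re

# Split on whitespace that follows ., !, ? or the Arabic question mark.
_SENTENCE_END = re.compile(r"(?<=[.!?؟])\s+")


def split_sentences(text):
    """Naively split text into sentences on end punctuation (hypothetical helper)."""
    return [s for s in _SENTENCE_END.split(text.strip()) if s]


def translate_text(text, model, tokenizer):
    """Translate a passage sentence by sentence and rejoin the results."""
    sentences = split_sentences(text)
    if not sentences:
        return ""
    # Same calls as the basic example, applied to a list of sentences.
    inputs = tokenizer(sentences, return_tensors="pt", padding=True)
    outputs = model.generate(**inputs)
    return " ".join(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```

Here model and tokenizer are the objects loaded with from_pretrained in the usage example.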
License
This model is released under the Apache 2.0 license.