Banglish-to-Bangla Transliteration Model
This model converts Banglish (Bengali written in Roman script) into Bengali script, for use in settings such as social media, messaging, and formal communication.
Quick Start
The following example loads the model and tokenizer with the transformers library and transliterates a Banglish sentence:
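from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

# "your-username/..." is a placeholder repository ID; replace it with the actual model path.
model = MBartForConditionalGeneration.from_pretrained("your-username/banglish-to-bangla-mbart")
tokenizer = MBart50TokenizerFast.from_pretrained("your-username/banglish-to-bangla-mbart")

def translate(text):
    # Tokenize the Banglish input, truncating it to at most 64 tokens.
    inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=64)
    # Decode with beam search; the attention mask tells the model which positions are padding.
    outputs = model.generate(inputs.input_ids, attention_mask=inputs.attention_mask, max_length=64, num_beams=5, early_stopping=True)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

print(translate("ami tomake valobashi"))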
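The Quick Start function handles one string at a time. For many inputs, batching the calls is usually faster. The sketch below is one possible way to do that, not part of the original model card: `make_batches` and `translate_batch` are hypothetical helper names, the batch size of 8 is an arbitrary assumption, and `model` and `tokenizer` are expected to be the objects loaded in the Quick Start.

```python
from typing import List


def make_batches(items: List[str], batch_size: int) -> List[List[str]]:
    """Split a list of strings into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def translate_batch(texts: List[str], model, tokenizer,
                    batch_size: int = 8, max_length: int = 64) -> List[str]:
    """Transliterate many Banglish strings, one padded batch at a time.

    Hypothetical helper: model and tokenizer are the objects loaded in the
    Quick Start; the generation settings mirror the single-string example.
    """
    results: List[str] = []
    for batch in make_batches(texts, batch_size):
        inputs = tokenizer(batch, return_tensors="pt", padding=True,
                           truncation=True, max_length=max_length)
        outputs = model.generate(inputs.input_ids,
                                 attention_mask=inputs.attention_mask,
                                 max_length=max_length, num_beams=5,
                                 early_stopping=True)
        # batch_decode returns one decoded string per generated sequence.
        results.extend(tokenizer.batch_decode(outputs, skip_special_tokens=True))
    return results
```

With the Quick Start objects in scope, a call might look like `translate_batch(["ami tomake valobashi", "tumi kemon acho"], model, tokenizer)`.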
Features
- Direct Use: It can perform transliteration of Banglish text to Bengali script for social media, messaging, and formal communication.
- Downstream Use: It supports fine-tuning for translation tasks between Bengali and other languages and can be integrated into chatbots or virtual assistants.
Documentation
Model Details
Model Description
This model is designed to transliterate Banglish (Bengali written in Roman script) into Bengali script. It is fine-tuned from the facebook/mbart-large-50-many-to-many-mmt model using the SKNahin/bengali-transliteration-data dataset.
| Property | Details |
|---|---|
| Developed by | Md. Farhan Masud Shohag |
| Model Type | Sequence-to-Sequence (Translation) |
| Language(s) | Banglish → Bengali (bn_BD) |
| License | Apache 2.0 |
| Fine-tuned from | facebook/mbart-large-50-many-to-many-mmt |
Model Sources
Uses
Direct Use
- Transliteration of Banglish text to Bengali script for social media, messaging, and formal communication.
Downstream Use
- Fine-tuning for translation tasks between Bengali and other languages.
- Integration into chatbots or virtual assistants.
Out-of-Scope Use
- General-purpose language translation between unrelated languages.
- Handling code-mixed languages (e.g., Banglish + English combinations).
Bias, Risks, and Limitations
Biases
- The dataset may include informal phrases, potentially reducing performance on formal language.
- Performance may degrade for long or complex sentences.
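Since long inputs are a known weak spot, and the Quick Start truncates input at 64 tokens, one workaround is to split long text into sentence-sized chunks and transliterate each chunk separately. The sketch below is only an illustration: `split_for_transliteration` is a hypothetical helper, and the 20-word cap is an assumed stand-in for the token limit, which actually depends on the tokenizer.

```python
import re
from typing import List

# Split after sentence-ending punctuation, including the Bengali danda (U+0964).
_SENTENCE_END = re.compile(r"(?<=[.!?\u0964])\s+")


def split_for_transliteration(text: str, max_words: int = 20) -> List[str]:
    """Split text into sentence-like chunks of at most max_words words each.

    max_words is an assumed proxy for the model's token limit; tune it
    against the real tokenizer before relying on it.
    """
    chunks: List[str] = []
    for sentence in _SENTENCE_END.split(text.strip()):
        words = sentence.split()
        # Break overly long sentences into fixed-size word windows.
        for start in range(0, len(words), max_words):
            chunks.append(" ".join(words[start:start + max_words]))
    return chunks
```

Each chunk can then be passed to the transliteration function and the results joined with spaces. Splitting mid-sentence may cost some context, so the word cap is a trade-off rather than a fix.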
Limitations
- Model performance may vary for rare phrases or slang.
- The model does not handle mixed-language inputs effectively.
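A cheap pre-check can catch one kind of mixed input before it reaches the model: text that already contains Bengali script or other non-Roman letters. The sketch below uses hypothetical helper names and only inspects Unicode code points. It cannot tell English words from Banglish words, since both are written in Roman letters; that would need a word list or a language-identification step, which is not shown here.

```python
def has_bengali_script(text: str) -> bool:
    """Return True if any character falls in the Bengali Unicode block (U+0980 to U+09FF)."""
    return any("\u0980" <= ch <= "\u09ff" for ch in text)


def is_roman_only(text: str) -> bool:
    """Return True if text has at least one letter and every letter is ASCII.

    Digits, punctuation, and whitespace are ignored.
    """
    letters = [ch for ch in text if ch.isalpha()]
    return bool(letters) and all(ch.isascii() for ch in letters)
```

Inputs that fail `is_roman_only` could be skipped, routed elsewhere, or flagged for manual review, depending on the application.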
Recommendations
Important Note
Users should evaluate outputs for their specific use cases, especially in formal contexts. Additional filtering or preprocessing may be required.
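One way to evaluate outputs on your own data is to compare them against reference transliterations using character error rate (CER): the Levenshtein edit distance divided by the reference length. The sketch below is a plain-Python illustration and not the evaluation used for this model. The function names are hypothetical, and distance is counted over Unicode code points, so Bengali vowel signs and other combining marks count as separate characters.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b, computed over Unicode code points."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(prediction: str, reference: str) -> float:
    """Edit distance divided by the reference length; 0.0 means an exact match."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(prediction, reference) / len(reference)
```

Averaging `character_error_rate` over a small hand-checked sample from the target domain gives a rough signal of whether the model is adequate there, for example in formal text.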