romaneng2nep_v3
This model is a fine-tuned version of google/mt5-small on the syubraj/roman2nepali-transliteration dataset, aiming to achieve high-quality Romanized English to Nepali transliteration.

🚀 Quick Start
The model achieves the following results on the evaluation set:
- Loss: 2.9652
- Gen Len: 5.1538
✨ Features
- Transliterates Romanized (Latin-script) input into Nepali (Devanagari) text
- Fine-tuned from google/mt5-small
- Trained on the syubraj/roman2nepali-transliteration dataset
📦 Installation
```bash
pip install transformers
```
The mT5 tokenizer is SentencePiece-based, so you may also need sentencepiece (and torch, for the PyTorch weights) installed if they are not already available in your environment.
💻 Usage Examples
Basic Usage
```python
from transformers import AutoTokenizer, MT5ForConditionalGeneration

checkpoint = "syubraj/romaneng2nep_v3"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = MT5ForConditionalGeneration.from_pretrained(checkpoint)

max_seq_len = 20

def translate(text):
    # Tokenize the Romanized input, truncating to the training sequence length
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=max_seq_len)
    # Generate the Nepali output and decode it back into a string
    translated = model.generate(**inputs)
    translated_text = tokenizer.decode(translated[0], skip_special_tokens=True)
    return translated_text

source_text = "muskuraudai"
translated_text = translate(source_text)
print(f"Translated Text: {translated_text}")
```
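If you want tighter control over the output length, `generate` accepts standard generation arguments such as `max_new_tokens`. The variant below is an illustrative sketch rather than part of the original card: the helper name is hypothetical, and it reuses the `tokenizer`, `model`, and `max_seq_len` defined above.

```python
def translate_with_length_control(text, max_new_tokens=20):
    # Illustrative variant of translate(): cap the number of generated tokens
    # via max_new_tokens, a standard transformers generation argument.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=max_seq_len)
    translated = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(translated[0], skip_special_tokens=True)

print(translate_with_length_control("muskuraudai"))
```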
📚 Documentation
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
| Property | Details |
|---|---|
| learning_rate | 2e-05 |
| train_batch_size | 24 |
| eval_batch_size | 24 |
| seed | 42 |
| optimizer | Adam with betas=(0.9,0.999) and epsilon=1e-08 |
| lr_scheduler_type | linear |
| num_epochs | 4 |
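For anyone reproducing the setup, the hyperparameters above map onto transformers' Seq2SeqTrainingArguments roughly as sketched below. This is an assumption about how training was configured, not the author's original script; the output_dir value is a placeholder.

```python
from transformers import Seq2SeqTrainingArguments

# Hypothetical reconstruction of the training configuration from the table above.
training_args = Seq2SeqTrainingArguments(
    output_dir="romaneng2nep_v3",       # placeholder path, not from the card
    learning_rate=2e-5,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    seed=42,
    num_train_epochs=4,
    lr_scheduler_type="linear",
    adam_beta1=0.9,                     # Adam betas/epsilon from the table
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    predict_with_generate=True,         # generate during eval (needed for Gen Len)
)
```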
Training results
| Step | Training Loss | Validation Loss | Gen Len |
|---|---|---|---|
| 1000 | 15.0703 | 5.6154 | 2.3840 |
| 2000 | 6.0460 | 4.4449 | 4.6281 |
| 3000 | 5.2580 | 3.9632 | 4.7790 |
| 4000 | 4.8563 | 3.6188 | 5.0053 |
| 5000 | 4.5602 | 3.3491 | 5.3085 |
| 6000 | 4.3146 | 3.1572 | 5.2562 |
| 7000 | 4.1228 | 3.0084 | 5.2197 |
| 8000 | 3.9695 | 2.8727 | 5.2140 |
| 9000 | 3.8342 | 2.7651 | 5.1834 |
| 10000 | 3.7319 | 2.6661 | 5.1977 |
| 11000 | 3.6485 | 2.5864 | 5.1536 |
| 12000 | 3.5541 | 2.5080 | 5.1990 |
| 13000 | 3.4959 | 2.4464 | 5.1775 |
| 14000 | 3.4315 | 2.3931 | 5.1747 |
| 15000 | 3.3663 | 2.3401 | 5.1625 |
| 16000 | 3.3204 | 2.3034 | 5.1481 |
| 17000 | 3.2417 | 2.2593 | 5.1663 |
| 18000 | 3.2186 | 2.2283 | 5.1351 |
| 19000 | 3.1822 | 2.1946 | 5.1573 |
| 20000 | 3.1449 | 2.1690 | 5.1649 |
| 21000 | 3.1067 | 2.1402 | 5.1624 |
| 22000 | 3.0844 | 2.1258 | 5.1479 |
| 23000 | 3.0574 | 2.1066 | 5.1518 |
| 24000 | 3.0357 | 2.0887 | 5.1446 |
| 25000 | 3.0136 | 2.0746 | 5.1559 |
| 26000 | 2.9957 | 2.0609 | 5.1658 |
| 27000 | 2.9865 | 2.0510 | 5.1791 |
| 28000 | 2.9765 | 2.0456 | 5.1574 |
| 29000 | 2.9675 | 2.0386 | 5.1620 |
| 30000 | 2.9678 | 2.0344 | 5.1601 |
| 31000 | 2.9652 | 2.0320 | 5.1538 |
Framework versions
| Property | Details |
|---|---|
| Transformers | 4.45.1 |
| Pytorch | 2.4.0 |
| Datasets | 3.0.1 |
| Tokenizers | 0.20.0 |
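A quick, optional sanity check (not part of the original card) to confirm that your local versions roughly match those listed above:

```python
import transformers, torch, datasets, tokenizers

# Print installed versions to compare against the framework versions table.
print("Transformers:", transformers.__version__)
print("PyTorch:", torch.__version__)
print("Datasets:", datasets.__version__)
print("Tokenizers:", tokenizers.__version__)
```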
Citation
If you find this model useful, please cite it:
```bibtex
@misc{yubraj_sigdel_2024,
  author    = { {Yubraj Sigdel} },
  title     = { romaneng2nep_v3 (Revision dca017e) },
  year      = 2024,
  url       = { https://huggingface.co/syubraj/romaneng2nep_v3 },
  doi       = { 10.57967/hf/3252 },
  publisher = { Hugging Face }
}
```
📄 License
This model is licensed under the Apache 2.0 license.