🚀 t5-small_6_3-hi_en-to-en
This model translates Hinglish (hi_en) text into English (en). It was trained from scratch on the cmu_hinglish_dog dataset and reaches a BLEU score of 17.8061 on the validation split and 18.0863 on the test split.
🚀 Quick Start
The model is ready to use for Hinglish to English translation. It is a T5-style sequence-to-sequence checkpoint trained with Hugging Face Transformers, so it can be loaded and run like other T5 models in that library.
✨ Features
- Trained from Scratch: The model was trained from scratch on the cmu_hinglish_dog dataset, using a student architecture generated from t5-small with make_student.py (see Model description).
- Evaluation Results: It achieves a BLEU score of 18.0863 on the test split and 17.8061 on the validation split.
📦 Installation
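No model-specific installation steps are documented. The libraries used for training are listed under Framework versions (Transformers, PyTorch, Datasets, and Tokenizers).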
💻 Usage Examples
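The model follows the usual Transformers sequence-to-sequence workflow: tokenize the Hinglish input, generate output tokens, and decode them into English text.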
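As a starting point, the sketch below shows one way to run batched translation with the Transformers auto classes. The Hub ID is a placeholder, because this card gives only the model name and not the owner namespace. The batch size, the max_new_tokens value, and the absence of a task prefix are also assumptions; the card does not say whether a prefix was used during training.

```python
# Hypothetical Hub ID: replace <namespace> with the actual owner of the model.
MODEL_ID = "<namespace>/t5-small_6_3-hi_en-to-en"


def batches(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def translate(sentences, model_id=MODEL_ID, batch_size=32, max_new_tokens=64):
    """Translate a list of Hinglish sentences into English.

    batch_size and max_new_tokens are illustrative defaults, not values
    prescribed by the model card.
    """
    # Imported lazily so that the helper above works without these packages.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    model.eval()

    outputs = []
    for batch in batches(sentences, batch_size):
        # Assumption: no task prefix is prepended to the input text.
        encoded = tokenizer(batch, return_tensors="pt", padding=True, truncation=True)
        with torch.no_grad():
            generated = model.generate(**encoded, max_new_tokens=max_new_tokens)
        outputs.extend(tokenizer.batch_decode(generated, skip_special_tokens=True))
    return outputs
```

Calling translate(["yeh movie bahut acchi thi"]) with a valid model ID downloads the checkpoint on first use and returns a list with one English string. The input sentence here is a made-up illustration.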
📚 Documentation
Model description
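The model architecture was generated with the command below. Judging by the model name, the trailing 6 and 3 set the number of encoder and decoder layers of the student derived from t5-small: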
make_student.py t5-small t5_small_6_3 6 3
Check this link for more information.
Intended uses & limitations
More information needed
Training and evaluation data
The model was trained and evaluated on the cmu_hinglish_dog dataset. Please check this link for the dataset description.
Translation
- Source: hi_en: The text in Hinglish
- Target: en: The text in English
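The source and target fields above can be turned into training pairs with a small helper. This is a minimal sketch that assumes each record stores both sides under a "translation" mapping keyed by hi_en and en, as Hugging Face translation datasets commonly do. The example record is made up for illustration and is not taken from the dataset.

```python
SOURCE_LANG = "hi_en"  # Hinglish side, as named in this card
TARGET_LANG = "en"     # English side, as named in this card


def to_pair(record):
    """Return (source_text, target_text) for one dataset record.

    Assumes the record has a "translation" dict holding both languages.
    """
    translation = record["translation"]
    return translation[SOURCE_LANG], translation[TARGET_LANG]


# Made-up record for illustration only.
example = {
    "translation": {
        "hi_en": "yeh movie bahut acchi thi",
        "en": "this movie was very good",
    }
}

source, target = to_pair(example)
print(source)
print(target)
```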
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
| Property | Details |
|---|---|
| learning_rate | 5e-05 |
| train_batch_size | 32 |
| eval_batch_size | 32 |
| seed | 42 |
| gradient_accumulation_steps | 2 |
| total_train_batch_size | 64 |
| optimizer | Adam with betas=(0.9,0.999) and epsilon=1e-08 |
| lr_scheduler_type | linear |
| num_epochs | 100 |
| mixed_precision_training | Native AMP |
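The relationship between these hyperparameters can be checked with a little arithmetic. The sketch below derives the total train batch size from the per-device batch size and the gradient accumulation steps, then models the linear learning-rate schedule. The 126 steps per epoch are taken from the Step column of the training results. The warmup length and the single-device setup are assumptions, since the card mentions neither.

```python
# Values copied from the hyperparameter table.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 32
GRADIENT_ACCUMULATION_STEPS = 2
NUM_EPOCHS = 100

# From the training results: 126 optimizer steps per epoch.
STEPS_PER_EPOCH = 126

# Assumption: the card does not mention warmup, so none is modeled.
WARMUP_STEPS = 0


def effective_batch_size(per_device=TRAIN_BATCH_SIZE,
                         accumulation=GRADIENT_ACCUMULATION_STEPS,
                         devices=1):
    """Examples consumed per optimizer step (assumes a single device)."""
    return per_device * accumulation * devices


def linear_lr(step, total_steps, base_lr=LEARNING_RATE, warmup_steps=WARMUP_STEPS):
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_steps = STEPS_PER_EPOCH * NUM_EPOCHS

print(effective_batch_size())                   # 32 * 2 * 1
print(total_steps)                              # 126 * 100
print(linear_lr(total_steps // 2, total_steps)) # halfway through training
```

Under these assumptions the effective batch size matches the reported total_train_batch_size of 64, and the learning rate falls from 5e-05 at step 0 to zero at step 12600.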
Training results
| Training Loss | Epoch | Step | Validation Loss | BLEU | Gen Len |
|---|---|---|---|---|---|
| No log | 1.0 | 126 | 3.0601 | 4.7146 | 11.9904 |
| No log | 2.0 | 252 | 2.8885 | 5.9584 | 12.3418 |
| No log | 3.0 | 378 | 2.7914 | 6.649 | 12.3758 |
| 3.4671 | 4.0 | 504 | 2.7347 | 7.3305 | 12.3854 |
| 3.4671 | 5.0 | 630 | 2.6832 | 8.3132 | 12.4268 |
| 3.4671 | 6.0 | 756 | 2.6485 | 8.339 | 12.3641 |
| 3.4671 | 7.0 | 882 | 2.6096 | 8.7269 | 12.414 |
| 3.0208 | 8.0 | 1008 | 2.5814 | 9.2163 | 12.2675 |
| 3.0208 | 9.0 | 1134 | 2.5542 | 9.448 | 12.3875 |
| 3.0208 | 10.0 | 1260 | 2.5339 | 9.9011 | 12.4321 |
| 3.0208 | 11.0 | 1386 | 2.5043 | 9.7529 | 12.5149 |
| 2.834 | 12.0 | 1512 | 2.4848 | 9.9606 | 12.4193 |
| 2.834 | 13.0 | 1638 | 2.4737 | 9.9368 | 12.3673 |
| 2.834 | 14.0 | 1764 | 2.4458 | 10.3182 | 12.4352 |
| 2.834 | 15.0 | 1890 | 2.4332 | 10.486 | 12.4671 |
| 2.7065 | 16.0 | 2016 | 2.4239 | 10.6921 | 12.414 |
| 2.7065 | 17.0 | 2142 | 2.4064 | 10.7426 | 12.4607 |
| 2.7065 | 18.0 | 2268 | 2.3941 | 11.0509 | 12.4087 |
| 2.7065 | 19.0 | 2394 | 2.3826 | 11.2407 | 12.3386 |
| 2.603 | 20.0 | 2520 | 2.3658 | 11.3711 | 12.3992 |
| 2.603 | 21.0 | 2646 | 2.3537 | 11.42 | 12.5032 |
| 2.603 | 22.0 | 2772 | 2.3475 | 12.0665 | 12.5074 |
| 2.603 | 23.0 | 2898 | 2.3398 | 12.0343 | 12.4342 |
| 2.5192 | 24.0 | 3024 | 2.3298 | 12.1011 | 12.5096 |
| 2.5192 | 25.0 | 3150 | 2.3216 | 12.2562 | 12.4809 |
| 2.5192 | 26.0 | 3276 | 2.3131 | 12.4585 | 12.4427 |
| 2.5192 | 27.0 | 3402 | 2.3052 | 12.7094 | 12.534 |
| 2.4445 | 28.0 | 3528 | 2.2984 | 12.7432 | 12.5053 |
| 2.4445 | 29.0 | 3654 | 2.2920 | 12.8409 | 12.4501 |
| 2.4445 | 30.0 | 3780 | 2.2869 | 12.6365 | 12.4936 |
| 2.4445 | 31.0 | 3906 | 2.2777 | 12.8523 | 12.5234 |
| 2.3844 | 32.0 | 4032 | 2.2788 | 12.9216 | 12.4204 |
| 2.3844 | 33.0 | 4158 | 2.2710 | 12.9568 | 12.5064 |
| 2.3844 | 34.0 | 4284 | 2.2643 | 12.9641 | 12.4299 |
| 2.3844 | 35.0 | 4410 | 2.2621 | 12.9787 | 12.448 |
| 2.3282 | 36.0 | 4536 | 2.2554 | 13.1264 | 12.4374 |
| 2.3282 | 37.0 | 4662 | 2.2481 | 13.1853 | 12.4416 |
| 2.3282 | 38.0 | 4788 | 2.2477 | 13.3259 | 12.4119 |
| 2.3282 | 39.0 | 4914 | 2.2448 | 13.2017 | 12.4278 |
| 2.2842 | 40.0 | 5040 | 2.2402 | 13.3772 | 12.4437 |
| 2.2842 | 41.0 | 5166 | 2.2373 | 13.2184 | 12.414 |
| 2.2842 | 42.0 | 5292 | 2.2357 | 13.5267 | 12.4342 |
| 2.2842 | 43.0 | 5418 | 2.2310 | 13.5754 | 12.4087 |
| 2.2388 | 44.0 | 5544 | 2.2244 | 13.653 | 12.4427 |
| 2.2388 | 45.0 | 5670 | 2.2243 | 13.6028 | 12.431 |
| 2.2388 | 46.0 | 5796 | 2.2216 | 13.7128 | 12.4151 |
| 2.2388 | 47.0 | 5922 | 2.2231 | 13.749 | 12.4172 |
| 2.2067 | 48.0 | 6048 | 2.2196 | 13.7256 | 12.4034 |
| 2.2067 | 49.0 | 6174 | 2.2125 | 13.8237 | 12.396 |
| 2.2067 | 50.0 | 6300 | 2.2131 | 13.6642 | 12.4416 |
| 2.2067 | 51.0 | 6426 | 2.2115 | 13.8876 | 12.4119 |
| 2.1688 | 52.0 | 6552 | 2.2091 | 14.0323 | 12.4639 |
| 2.1688 | 53.0 | 6678 | 2.2082 | 13.916 | 12.3843 |
| 2.1688 | 54.0 | 6804 | 2.2071 | 13.924 | 12.3758 |
| 2.1688 | 55.0 | 6930 | 2.2046 | 13.9563 | 12.4416 |
| 2.1401 | 56.0 | 7056 | 2.2020 | 14.0592 | 12.483 |
| 2.1401 | 57.0 | 7182 | 2.2047 | 13.8879 | 12.4076 |
| 2.1401 | 58.0 | 7308 | 2.2018 | 13.9267 | 12.3949 |
| 2.1401 | 59.0 | 7434 | 2.1964 | 14.0518 | 12.4363 |
| 2.1092 | 60.0 | 7560 | 2.1926 | 14.1518 | 12.4883 |
| 2.1092 | 61.0 | 7686 | 2.1972 | 14.132 | 12.4034 |
| 2.1092 | 62.0 | 7812 | 2.1939 | 14.2066 | 12.4151 |
| 2.1092 | 63.0 | 7938 | 2.1905 | 14.2923 | 12.4459 |
| 2.0932 | 64.0 | 8064 | 2.1932 | 14.2476 | 12.3418 |
| 2.0932 | 65.0 | 8190 | 2.1925 | 14.2057 | 12.3907 |
| 2.0932 | 66.0 | 8316 | 2.1906 | 14.2978 | 12.4055 |
| 2.0932 | 67.0 | 8442 | 2.1903 | 14.3276 | 12.4427 |
| 2.0706 | 68.0 | 8568 | 2.1918 | 14.4681 | 12.4034 |
| 2.0706 | 69.0 | 8694 | 2.1882 | 14.3751 | 12.4225 |
| 2.0706 | 70.0 | 8820 | 2.1870 | 14.5904 | 12.4204 |
| 2.0706 | 71.0 | 8946 | 2.1865 | 14.6409 | 12.4512 |
| 2.0517 | 72.0 | 9072 | 2.1831 | 14.6505 | 12.4352 |
| 2.0517 | 73.0 | 9198 | 2.1835 | 14.7485 | 12.4363 |
| 2.0517 | 74.0 | 9324 | 2.1824 | 14.7344 | 12.4586 |
| 2.0517 | 75.0 | 9450 | 2.1829 | 14.8097 | 12.4575 |
| 2.0388 | 76.0 | 9576 | 2.1822 | 14.6681 | 12.4108 |
| 2.0388 | 77.0 | 9702 | 2.1823 | 14.6421 | 12.4342 |
| 2.0388 | 78.0 | 9828 | 2.1816 | 14.7014 | 12.4459 |
| 2.0388 | 79.0 | 9954 | 2.1810 | 14.744 | 12.4565 |
| 2.0224 | 80.0 | 10080 | 2.1839 | 14.7889 | 12.4437 |
| 2.0224 | 81.0 | 10206 | 2.1793 | 14.802 | 12.4565 |
| 2.0224 | 82.0 | 10332 | 2.1776 | 14.7702 | 12.4214 |
| 2.0224 | 83.0 | 10458 | 2.1809 | 14.6772 | 12.4236 |
| 2.0115 | 84.0 | 10584 | 2.1786 | 14.709 | 12.4214 |
| 2.0115 | 85.0 | 10710 | 2.1805 | 14.7693 | 12.3981 |
| 2.0115 | 86.0 | 10836 | 2.1790 | 14.7628 | 12.4172 |
| 2.0115 | 87.0 | 10962 | 2.1785 | 14.7538 | 12.3992 |
| 2.0007 | 88.0 | 11088 | 2.1788 | 14.7493 | 12.3726 |
| 2.0007 | 89.0 | 11214 | 2.1788 | 14.8793 | 12.4045 |
| 2.0007 | 90.0 | 11340 | 2.1786 | 14.8318 | 12.3747 |
| 2.0007 | 91.0 | 11466 | 2.1769 | 14.8061 | 12.4013 |
| 1.9967 | 92.0 | 11592 | 2.1757 | 14.8108 | 12.3843 |
| 1.9967 | 93.0 | 11718 | 2.1747 | 14.8036 | 12.379 |
| 1.9967 | 94.0 | 11844 | 2.1764 | 14.7447 | 12.3737 |
| 1.9967 | 95.0 | 11970 | 2.1759 | 14.7759 | 12.3875 |
| 1.9924 | 96.0 | 12096 | 2.1760 | 14.7695 | 12.3875 |
| 1.9924 | 97.0 | 12222 | 2.1762 | 14.8022 | 12.3769 |
| 1.9924 | 98.0 | 12348 | 2.1763 | 14.7519 | 12.3822 |
| 1.9924 | 99.0 | 12474 | 2.1760 | 14.7756 | 12.3832 |
| 1.9903 | 100.0 | 12600 | 2.1761 | 14.7713 | 12.3822 |
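The Training Loss column repeats the same value for several epochs and shows "No log" for the first three. One plausible explanation is that the training loss was logged every 500 optimizer steps, the default logging_steps value in the Transformers Trainer, while evaluation ran once per epoch (every 126 steps). The card does not state the logging interval, so the value below is an assumption inferred from the table. The sketch computes the most recent logging step visible at the end of each epoch.

```python
# Assumption: training loss is logged every 500 optimizer steps
# (the Trainer default for logging_steps); the card does not state this.
LOGGING_STEPS = 500

# From the Step column of the table: 126, 252, 378, ...
STEPS_PER_EPOCH = 126


def last_logged_step(epoch, steps_per_epoch=STEPS_PER_EPOCH, logging_steps=LOGGING_STEPS):
    """Return the latest logging step reached by the end of `epoch`, or None if none yet."""
    step = epoch * steps_per_epoch
    logged = (step // logging_steps) * logging_steps
    return logged if logged > 0 else None


for epoch in (3, 4, 7, 8, 100):
    print(epoch, epoch * STEPS_PER_EPOCH, last_logged_step(epoch))
# 3 378 None
# 4 504 500
# 7 882 500
# 8 1008 1000
# 100 12600 12500
```

This matches the table: epochs 1 to 3 end before step 500 and show "No log", epochs 4 to 7 share the loss logged at step 500, and a new value appears at epoch 8 once step 1000 has passed.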
Evaluation results
| Data Split | BLEU |
|---|---|
| Validation | 17.8061 |
| Test | 18.0863 |
Framework versions
- Transformers 4.20.0.dev0
- Pytorch 1.8.0
- Datasets 2.1.0
- Tokenizers 0.12.1