t5-small-finetuned-hausa-to-chinese
This model is a fine-tuned version of [google-t5/t5-small](https://huggingface.co/google-t5/t5-small) on an unspecified dataset. It is designed for translation tasks, specifically from Hausa to Chinese. On the evaluation set, it achieves the following results:
- Loss: 0.3817
- Bleu: 30.2633
- Gen Len: 3.5559
Quick Start
This model can be used for Hausa-to-Chinese translation tasks. You can load it with the transformers library and perform inference, as shown in the sketch below.
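A minimal inference sketch using the transformers library. The repo id below is a placeholder (replace it with the actual Hub path or a local checkpoint directory), and the Hausa sentence is only an illustration:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Placeholder repo id -- replace with the actual Hub path or a local checkpoint directory.
model_id = "your-username/t5-small-finetuned-hausa-to-chinese"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Example Hausa sentence (illustrative only).
text = "Ina son koyon sabbin harsuna."

inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Depending on how the model was fine-tuned, a task prefix (as used by the original T5 checkpoints) may or may not be required; check the training setup before relying on raw inputs.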
Installation
If you haven't installed the transformers library yet, you can install it with the following command:
```bash
pip install transformers
```
Documentation
Model description
This model is a fine-tuned version of the base model [google-t5/t5-small](https://huggingface.co/google-t5/t5-small). More detailed information about its architecture and task-specific changes has not yet been provided.
Intended uses & limitations
The model is primarily intended for Hausa-to-Chinese translation. More information about its intended uses and limitations is needed.
Training and evaluation data
Currently, more information about the training and evaluation data is needed.
Training procedure
Training hyperparameters
The following hyperparameters were used during training (see the configuration sketch below the table):
| Property | Details |
|---|---|
| learning_rate | 0.0008 |
| train_batch_size | 32 |
| eval_batch_size | 64 |
| seed | 42 |
| optimizer | Adam with betas=(0.9,0.999) and epsilon=1e-08 |
| lr_scheduler_type | cosine |
| lr_scheduler_warmup_steps | 4000 |
| num_epochs | 20 |
| mixed_precision_training | Native AMP |
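A minimal sketch of how these hyperparameters might map onto Seq2SeqTrainingArguments in the Trainer API; the output directory and the evaluation/generation settings are assumptions, not taken from this card:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-small-finetuned-hausa-to-chinese",  # assumed output path
    learning_rate=8e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    seed=42,
    lr_scheduler_type="cosine",
    warmup_steps=4000,
    num_train_epochs=20,
    fp16=True,                    # "Native AMP" mixed-precision training
    eval_strategy="epoch",        # assumed: per-epoch metrics are reported below
    predict_with_generate=True,   # assumed: needed to compute Bleu / Gen Len during evaluation
)
```

The Adam betas and epsilon listed in the table match the Trainer's optimizer defaults, so they do not need to be set explicitly.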
Training results
| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|---|---|---|---|---|---|
| 0.6981 | 1.0 | 846 | 0.2900 | 14.2476 | 3.4917 |
| 0.3149 | 2.0 | 1692 | 0.2639 | 18.6104 | 3.4725 |
| 0.2782 | 3.0 | 2538 | 0.2467 | 9.1092 | 3.2542 |
| 0.2622 | 4.0 | 3384 | 0.2481 | 24.1345 | 3.4047 |
| 0.2428 | 5.0 | 4230 | 0.2529 | 16.9217 | 3.3965 |
| 0.2271 | 6.0 | 5076 | 0.2491 | 27.8491 | 3.5349 |
| 0.2047 | 7.0 | 5922 | 0.2507 | 16.6565 | 3.339 |
| 0.1902 | 8.0 | 6768 | 0.2506 | 25.6462 | 3.5667 |
| 0.1739 | 9.0 | 7614 | 0.2610 | 27.1673 | 3.5916 |
| 0.1587 | 10.0 | 8460 | 0.2438 | 29.306 | 3.5839 |
| 0.1425 | 11.0 | 9306 | 0.2660 | 29.08 | 3.6478 |
| 0.1251 | 12.0 | 10152 | 0.2721 | 29.9148 | 3.4994 |
| 0.1105 | 13.0 | 10998 | 0.2929 | 28.1895 | 3.5526 |
| 0.0956 | 14.0 | 11844 | 0.3010 | 30.552 | 3.5717 |
| 0.083 | 15.0 | 12690 | 0.3307 | 27.9728 | 3.5303 |
| 0.0724 | 16.0 | 13536 | 0.3404 | 27.1874 | 3.5146 |
| 0.0652 | 17.0 | 14382 | 0.3592 | 29.9567 | 3.5529 |
| 0.0568 | 18.0 | 15228 | 0.3774 | 30.5145 | 3.5668 |
| 0.0549 | 19.0 | 16074 | 0.3795 | 30.6604 | 3.5637 |
| 0.0526 | 20.0 | 16920 | 0.3817 | 30.2633 | 3.5559 |
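The Bleu and Gen Len columns are the kind of metrics typically produced by a compute_metrics callback passed to the trainer. A minimal sketch, assuming sacrebleu via the evaluate library and a placeholder repo id (neither is confirmed by this card):

```python
import numpy as np
import evaluate
from transformers import AutoTokenizer

# Placeholder repo id -- replace with the actual checkpoint path.
tokenizer = AutoTokenizer.from_pretrained("your-username/t5-small-finetuned-hausa-to-chinese")
sacrebleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    if isinstance(preds, tuple):
        preds = preds[0]
    # Label padding uses -100; swap it back to the pad token id before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    result = sacrebleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    # Mean number of non-pad tokens in the generated sequences.
    gen_len = np.mean([np.count_nonzero(pred != tokenizer.pad_token_id) for pred in preds])
    return {"bleu": result["score"], "gen_len": gen_len}
```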
Framework versions
| Property | Details |
|---|---|
| Transformers | 4.44.2 |
| Pytorch | 2.4.0+cu121 |
| Datasets | 2.21.0 |
| Tokenizers | 0.19.1 |
License
This model is released under the Apache-2.0 license.