mt5-small-finetuned-oneindia
This model is a fine-tuned version of google/mt5-small that generates one-line summaries for news articles.
🚀 Quick Start
The model was fine-tuned on the Malayalam news dataset from Kaggle: https://www.kaggle.com/datasets/akhisreelibra/malayalam-news.
It achieves the following results on the evaluation set (a metric-computation sketch follows the list):
- Loss: 2.0277
- Rouge1: 6.1146
- Rouge2: 0.9858
- Rougel: 6.085
- Rougelsum: 6.0965
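The ROUGE figures above are F-measures scaled to 0–100. As a minimal sketch, they can be computed with the `load_metric` API from Datasets 2.1 (the prediction/reference strings below are placeholders, and the `rouge_score` package must be installed separately):

```python
from datasets import load_metric

rouge = load_metric("rouge")  # requires the rouge_score package
scores = rouge.compute(
    predictions=["generated headline"],  # placeholder model outputs
    references=["reference headline"],   # placeholder gold headlines
)
# Each entry aggregates precision/recall/F1; the card reports mid F1 * 100.
print({k: round(v.mid.fmeasure * 100, 4) for k, v in scores.items()})
```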
✨ Features
- Generates a one-line summary of a given news article.
- Intended for the news domain; it can be used to generate headlines for news articles.
📦 Installation
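The model runs on the standard Hugging Face stack. A minimal setup, pinning the versions listed under Framework versions below (other compatible versions should also work):

```bash
pip install transformers==4.20.0 datasets==2.1.0 tokenizers==0.12.1
pip install torch==1.11.0
```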
💻 Usage Examples
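A minimal generation sketch, assuming the checkpoint is available locally or on the Hugging Face Hub; the repo id below is a placeholder, and the generation parameters (beam count, length limits) are illustrative rather than taken from the original card:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Placeholder repo id; replace with the actual Hub id or a local path.
model_id = "mt5-small-finetuned-oneindia"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

article = "..."  # a Malayalam news article (elided)
inputs = tokenizer(article, return_tensors="pt", truncation=True, max_length=512)

# Generate a short, headline-style summary.
summary_ids = model.generate(**inputs, num_beams=4, max_length=64)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```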
📚 Documentation
Training and evaluation data
The model was trained and evaluated on the Malayalam news dataset from Kaggle: https://www.kaggle.com/datasets/akhisreelibra/malayalam-news
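As a rough sketch, the dataset can be loaded after downloading it from Kaggle; the file name and the 90/10 split below are assumptions for illustration, not details from the original card:

```python
from datasets import load_dataset

# Hypothetical local path to the downloaded Kaggle CSV.
data = load_dataset("csv", data_files="malayalam-news.csv")["train"]
# Illustrative train/eval split.
data = data.train_test_split(test_size=0.1, seed=42)
print(data)
```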
Training procedure
Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 5.6e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 8
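As a sketch, these settings map onto `Seq2SeqTrainingArguments` from Transformers 4.20 as follows; `output_dir` and anything not listed above are illustrative:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-finetuned-oneindia",  # illustrative output path
    learning_rate=5.6e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,        # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=8,
)
```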
Training results
| Training Loss | Epoch | Step  | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|:------:|:---------:|
| 3.1789        | 1.0   | 8994  | 2.2686          | 5.5492 | 0.8569 | 5.5448 | 5.5362    |
| 2.5859        | 2.0   | 17988 | 2.1751          | 5.7343 | 0.9775 | 5.7166 | 5.7195    |
| 2.4438        | 3.0   | 26982 | 2.1082          | 5.9718 | 0.9917 | 5.9503 | 5.9586    |
| 2.3584        | 4.0   | 35976 | 2.0730          | 6.1837 | 1.0286 | 6.1562 | 6.167     |
| 2.3015        | 5.0   | 44970 | 2.0512          | 6.0863 | 0.9656 | 6.0598 | 6.0643    |
| 2.2609        | 6.0   | 53964 | 2.0440          | 6.2119 | 0.9653 | 6.1859 | 6.1899    |
| 2.2344        | 7.0   | 62958 | 2.0303          | 6.2424 | 0.9835 | 6.2121 | 6.219     |
| 2.2166        | 8.0   | 71952 | 2.0277          | 6.1146 | 0.9858 | 6.085  | 6.0965    |
Framework versions
- Transformers 4.20.0
- Pytorch 1.11.0
- Datasets 2.1.0
- Tokenizers 0.12.1
📄 License
This model is licensed under the Apache 2.0 license.