mt5-small-finetuned-gazeta-ru
This model is a fine-tuned version of google/mt5-small on the gazeta dataset. It is intended for text summarization; its evaluation scores are reported in the sections that follow.
Quick Start
This model is a fine-tuned version of google/mt5-small on the gazeta dataset.
It achieves the following results on the evaluation set:
- Loss: 3.3287
- Rouge1: 2.9422
- Rouge2: 0.25
- Rougel: 2.9053
- Rougelsum: 2.9131
Features
- Fine-tuned for summarization: based on the google/mt5-small model and fine-tuned on the gazeta dataset for text summarization tasks.
- Reported evaluation metrics: after the final epoch, the model scores Rouge1 2.9422, Rouge2 0.25, Rougel 2.9053, and Rougelsum 2.9131 on the evaluation set. Per-epoch values are listed under Training results.
Usage Examples
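The card itself does not ship usage code, so the following is only a minimal sketch of how a fine-tuned mT5 checkpoint is typically loaded for summarization with the Transformers library. Several things here are assumptions rather than facts from the card: the repository id `your-namespace/mt5-small-finetuned-gazeta-ru` is a hypothetical placeholder (the card does not state the Hub namespace), and the decoding settings and the 512-token input limit are illustrative defaults, not values documented by the author.

```python
# Hypothetical repository id: the card does not state the Hub namespace.
MODEL_ID = "your-namespace/mt5-small-finetuned-gazeta-ru"


def generation_kwargs(max_new_tokens: int = 64, num_beams: int = 4) -> dict:
    """Illustrative decoding settings (assumed; the card documents none)."""
    return {
        "max_new_tokens": max_new_tokens,
        "num_beams": num_beams,
        "no_repeat_ngram_size": 3,
    }


def summarize(text: str, model_id: str = MODEL_ID, max_input_tokens: int = 512) -> str:
    """Generate a summary for a single article with a seq2seq checkpoint."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Truncate long articles to the assumed input budget.
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_tokens,
    )
    output_ids = model.generate(**inputs, **generation_kwargs())
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(article_text)` with the real repository id substituted would return the decoded summary as a string. The first call downloads the checkpoint, and the mT5 tokenizer additionally requires the `sentencepiece` package.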
Documentation
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5.6e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 8
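To make the `linear` scheduler setting concrete, here is a small sketch of how the learning rate would decay over the run. It takes the peak rate and epoch count from the list above and the step count per epoch (1690) from the training results. It assumes zero warmup steps, since the card lists no warmup setting; if warmup was used, the actual curve would differ.

```python
PEAK_LR = 5.6e-05        # learning_rate from the hyperparameter list
STEPS_PER_EPOCH = 1690   # step count reported at epoch 1.0 in the training results
NUM_EPOCHS = 8           # num_epochs from the hyperparameter list
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay.

    warmup_steps=0 is an assumption; the card does not report a warmup value.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak rate.
        return PEAK_LR * step / max(1, warmup_steps)
    # Linear decay from the peak rate down to 0 at the final step.
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / max(1, TOTAL_STEPS - warmup_steps)


for epoch in range(NUM_EPOCHS + 1):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch}: step {step:>5}, lr {linear_lr(step):.2e}")
```

Under these assumptions the rate halves to 2.8e-05 at the midpoint (step 6760, end of epoch 4) and reaches zero at step 13520.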
Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|---|---|---|---|---|---|---|---|
| 5.5708 | 1.0 | 1690 | 3.3106 | 1.8563 | 0.1911 | 1.8332 | 1.8348 |
| 4.0219 | 2.0 | 3380 | 3.3048 | 2.2018 | 0.1649 | 2.1978 | 2.2022 |
| 3.7276 | 3.0 | 5070 | 3.3320 | 3.2293 | 0.2173 | 3.194 | 3.2039 |
| 3.5835 | 4.0 | 6760 | 3.3308 | 3.2189 | 0.2932 | 3.1825 | 3.1841 |
| 3.4944 | 5.0 | 8450 | 3.3104 | 2.8833 | 0.1964 | 2.8521 | 2.8537 |
| 3.4203 | 6.0 | 10140 | 3.3032 | 2.9914 | 0.2723 | 2.9516 | 2.9542 |
| 3.3774 | 7.0 | 11830 | 3.3232 | 2.9982 | 0.3063 | 2.965 | 2.9642 |
| 3.348 | 8.0 | 13520 | 3.3287 | 2.9422 | 0.25 | 2.9053 | 2.9131 |
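The per-epoch numbers are easier to compare when read by metric. As a small sketch, the snippet below copies the rows from the training results and looks up the epoch with the best value in each column. The helper name `best_epoch` is illustrative and not part of any library.

```python
# Rows copied from the training results:
# (epoch, validation_loss, rouge1, rouge2, rougel, rougelsum)
COLUMNS = ("epoch", "validation_loss", "rouge1", "rouge2", "rougel", "rougelsum")
ROWS = [
    (1, 3.3106, 1.8563, 0.1911, 1.8332, 1.8348),
    (2, 3.3048, 2.2018, 0.1649, 2.1978, 2.2022),
    (3, 3.3320, 3.2293, 0.2173, 3.1940, 3.2039),
    (4, 3.3308, 3.2189, 0.2932, 3.1825, 3.1841),
    (5, 3.3104, 2.8833, 0.1964, 2.8521, 2.8537),
    (6, 3.3032, 2.9914, 0.2723, 2.9516, 2.9542),
    (7, 3.3232, 2.9982, 0.3063, 2.9650, 2.9642),
    (8, 3.3287, 2.9422, 0.2500, 2.9053, 2.9131),
]


def best_epoch(metric: str, lower_is_better: bool = False) -> int:
    """Return the epoch whose row has the best value for the given column."""
    idx = COLUMNS.index(metric)
    pick = min if lower_is_better else max
    return pick(ROWS, key=lambda row: row[idx])[0]


best = {"validation_loss": best_epoch("validation_loss", lower_is_better=True)}
for name in ("rouge1", "rouge2", "rougel", "rougelsum"):
    best[name] = best_epoch(name)

print(best)
# {'validation_loss': 6, 'rouge1': 3, 'rouge2': 7, 'rougel': 3, 'rougelsum': 3}
```

On these figures the final epoch is not the best row for any column: validation loss is lowest at epoch 6, Rouge1, Rougel, and Rougelsum peak at epoch 3, and Rouge2 peaks at epoch 7.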
Framework versions
- Transformers 4.42.4
- PyTorch 2.3.1+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1
Model Information
| Property | Details |
|---|---|
| Model Type | Fine-tuned version of google/mt5-small |
| Training Data | gazeta |
| Metrics | rouge |