long-t5-base-govreport
This model is a fine-tuned version of google/long-t5-tglobal-base for text summarization. On the evaluation set it reaches a ROUGE-1 of 57.23 and a ROUGE-2 of 24.97.
Quick Start
This model is a fine-tuned version of google/long-t5-tglobal-base on the pszemraj/govreport-summarization-8192 dataset.
It achieves the following results on the evaluation set:
- Gen Len: 787.34
- Loss: 1.5448
- Rouge1: 57.2303
- Rouge2: 24.9705
- Rougel: 26.8081
- Rougelsum: 54.2747
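A minimal sketch of running the model for summarization with the Transformers seq2seq classes. The card does not state the hub namespace, so `MODEL_ID` is a placeholder. The 8192-token input cap is an assumption suggested by the dataset name, and the generation settings are illustrative values, not the settings behind the numbers above.

```python
# Placeholder: replace with the full hub ID, e.g. "<namespace>/long-t5-base-govreport".
MODEL_ID = "long-t5-base-govreport"

# Illustrative generation settings (assumptions, not taken from the card).
GENERATION_KWARGS = {
    "max_new_tokens": 1024,
    "num_beams": 4,
    "no_repeat_ngram_size": 3,
    "early_stopping": True,
}


def summarize(text: str, model_id: str = MODEL_ID, max_input_tokens: int = 8192) -> str:
    """Summarize one document; inputs longer than max_input_tokens are truncated."""
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_tokens,
    )
    output_ids = model.generate(**inputs, **GENERATION_KWARGS)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(report_text)` with a long report string returns a single summary string. For many documents, load the tokenizer and model once outside the function instead of per call.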
Documentation
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
Refer to the pszemraj/govreport-summarization-8192 dataset.
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 3
- eval_batch_size: 1
- seed: 4299
- gradient_accumulation_steps: 128
- total_train_batch_size: 384
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 25.0
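The arithmetic behind two of these values can be sketched in plain Python. The first function reproduces `total_train_batch_size` as per-device batch size times accumulation steps, assuming a single device. The second is a common form of cosine decay with linear warmup; it is assumed, not confirmed by the card, that the scheduler used in training follows this shape. The `total_steps` value of 1600 is a hypothetical figure (25 epochs at roughly 64 optimizer steps per epoch, as the step and epoch columns of the training log suggest).

```python
import math

LEARNING_RATE = 2e-4
TRAIN_BATCH_SIZE = 3
GRAD_ACCUM_STEPS = 128
WARMUP_RATIO = 0.05


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Number of examples contributing to one optimizer step."""
    return per_device * accum_steps * num_devices


def cosine_lr(step: int, total_steps: int,
              base_lr: float = LEARNING_RATE,
              warmup_ratio: float = WARMUP_RATIO) -> float:
    """Linear warmup to base_lr, then half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))  # 384

# Hypothetical total step count; see the assumptions stated above.
total_steps = 1600
schedule = [cosine_lr(s, total_steps) for s in range(total_steps + 1)]
```

With such a small per-device batch, gradient accumulation is what makes the large effective batch possible: each optimizer step consumes 128 forward and backward passes of 3 examples each.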
Training results
| Training Loss | Epoch | Step | Gen Len | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|---|---|---|---|---|---|---|---|---|
| 2.1198 | 0.39 | 25 | 805.336 | 1.8720 | 29.4332 | 7.3761 | 17.0816 | 25.065 |
| 1.8609 | 0.78 | 50 | 833.404 | 1.7601 | 35.3533 | 10.6624 | 18.643 | 31.6979 |
| 1.7805 | 1.17 | 75 | 866.356 | 1.6833 | 36.5786 | 11.1185 | 20.0358 | 33.2116 |
| 1.7352 | 1.56 | 100 | 822.348 | 1.6524 | 40.5489 | 13.0695 | 20.1256 | 37.1369 |
| 1.7371 | 1.95 | 125 | 765.6 | 1.6294 | 43.8594 | 15.2962 | 20.7807 | 40.3461 |
| 1.6428 | 2.34 | 150 | 844.184 | 1.6055 | 44.5054 | 15.731 | 21.2582 | 40.9775 |
| 1.6567 | 2.73 | 175 | 857.236 | 1.6031 | 47.3641 | 16.9664 | 21.4998 | 43.994 |
| 1.5773 | 3.12 | 200 | 841.86 | 1.5855 | 47.2284 | 17.3099 | 21.6793 | 43.9018 |
| 1.5614 | 3.52 | 225 | 832.8 | 1.5883 | 46.4612 | 17.1368 | 21.5931 | 43.1184 |
| 1.5328 | 3.91 | 250 | 790.056 | 1.5730 | 46.5685 | 17.5423 | 22.2082 | 43.1811 |
| 1.5194 | 4.3 | 275 | 825.868 | 1.5690 | 47.6205 | 18.377 | 22.7639 | 44.3701 |
| 1.571 | 4.69 | 300 | 794.032 | 1.5676 | 49.2203 | 19.1109 | 22.8005 | 46.0679 |
| 1.4275 | 5.08 | 325 | 833.068 | 1.5656 | 50.6982 | 20.0278 | 23.5585 | 47.5036 |
| 1.4912 | 5.47 | 350 | 793.068 | 1.5625 | 50.3371 | 19.8639 | 23.3666 | 47.1898 |
| 1.4764 | 5.86 | 375 | 819.86 | 1.5532 | 50.9702 | 20.7532 | 23.8765 | 47.9915 |
| 1.3972 | 6.25 | 400 | 770.78 | 1.5564 | 49.279 | 19.4781 | 23.1018 | 46.1942 |
| 1.4479 | 6.64 | 425 | 806.244 | 1.5529 | 50.3317 | 20.2888 | 23.4454 | 47.3491 |
| 1.4567 | 7.03 | 450 | 787.48 | 1.5590 | 52.2209 | 21.2868 | 23.9284 | 49.1691 |
| 1.3933 | 7.42 | 475 | 842.664 | 1.5561 | 51.9578 | 20.5806 | 23.7177 | 48.9121 |
| 1.4245 | 7.81 | 500 | 813.772 | 1.5420 | 52.3725 | 21.7787 | 24.5209 | 49.4003 |
| 1.3033 | 8.2 | 525 | 824.66 | 1.5499 | 52.7839 | 21.589 | 24.5617 | 49.8609 |
| 1.3673 | 8.59 | 550 | 807.348 | 1.5530 | 53.2339 | 22.152 | 24.7587 | 50.2502 |
| 1.3634 | 8.98 | 575 | 767.952 | 1.5458 | 53.0293 | 22.3194 | 25.174 | 50.078 |
| 1.3095 | 9.37 | 600 | 856.252 | 1.5412 | 53.7658 | 22.5229 | 25.0448 | 50.708 |
| 1.3492 | 9.76 | 625 | 826.064 | 1.5389 | 51.8662 | 21.6229 | 24.6819 | 48.8648 |
| 1.3007 | 10.16 | 650 | 843.544 | 1.5404 | 53.6692 | 22.154 | 24.6218 | 50.6864 |
| 1.2729 | 10.55 | 675 | 808.764 | 1.5428 | 54.6479 | 23.3029 | 25.5647 | 51.6394 |
| 1.3758 | 10.94 | 700 | 800.152 | 1.5403 | 54.9418 | 23.3323 | 25.6087 | 51.9256 |
| 1.3357 | 11.33 | 725 | 814.496 | 1.5455 | 55.2511 | 23.5606 | 25.8237 | 52.3183 |
| 1.2817 | 11.72 | 750 | 811.144 | 1.5412 | 55.2847 | 23.6632 | 25.9341 | 52.3146 |
| 1.2771 | 12.11 | 775 | 852.704 | 1.5450 | 55.1956 | 23.5545 | 25.677 | 52.1841 |
| 1.2892 | 12.5 | 800 | 805.844 | 1.5369 | 54.9563 | 23.5105 | 25.8876 | 51.9568 |
| 1.2757 | 12.89 | 825 | 813.476 | 1.5467 | 56.4728 | 24.6875 | 26.4415 | 53.4939 |
| 1.2382 | 13.28 | 850 | 787.34 | 1.5448 | 57.2303 | 24.9705 | 26.8081 | 54.2747 |
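The final logged step (850) matches the evaluation numbers reported in Quick Start and has the highest Rouge1, but the lowest validation loss in the log occurs at step 800. The sketch below shows how the choice of selection metric changes which checkpoint looks best. It uses only the last four logged steps, with values copied from the training results; the tuple layout is an illustrative choice.

```python
# (step, validation_loss, rouge1) for the last four logged steps.
ROWS = [
    (775, 1.5450, 55.1956),
    (800, 1.5369, 54.9563),
    (825, 1.5467, 56.4728),
    (850, 1.5448, 57.2303),
]

# Lower is better for loss, higher is better for ROUGE.
best_by_loss = min(ROWS, key=lambda row: row[1])
best_by_rouge1 = max(ROWS, key=lambda row: row[2])

print(best_by_loss[0])    # 800
print(best_by_rouge1[0])  # 850
```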
Framework versions
- Transformers 4.25.0.dev0
- Pytorch 1.13.0+cu117
- Datasets 2.7.0
- Tokenizers 0.13.2
License
This model is released under the Apache-2.0 license.
| Property | Details |
|---|---|
| Model Type | Fine-tuned version of google/long-t5-tglobal-base |
| Training Data | pszemraj/govreport-summarization-8192 |