🚀 rut5-base-summ-dialogsum
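This model is a fine-tuned version of d0rj/rut5-base-summ. The card records the training dataset as None, so the dataset is not identified here. It achieves the following results on the evaluation set, which match the epoch 19 row (step 14934) of the training results:
- Loss: 1.1263
- Rouge1: 33.5111
- Rouge2: 0.1696
- Rougel: 33.4559
- Rougelsum: 33.4934
- Gen Len: 4.1546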
📚 Detailed Documentation
Model information

| Property | Details |
|---|---|
| Base model | d0rj/rut5-base-summ |
| Tags | generated_from_trainer |
| Metrics | rouge |

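Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- Learning rate: 5e-05
- Train batch size: 8
- Eval batch size: 8
- Seed: 42
- Optimizer: Adam (β1 = 0.9, β2 = 0.999, ε = 1e-08)
- LR scheduler type: linear
- Number of epochs: 25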
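
The hyperparameters above can be collected into a small configuration object, together with a sketch of what a linear schedule does to the learning rate. This is an illustration under stated assumptions, not the original training script: the `Hyperparameters` class and the `linear_lr` helper are hypothetical names, the warmup length is assumed to be zero because the card lists none, and "linear" is assumed to mean linear decay to zero over the whole run. The total of 19650 steps is taken from the last row of the training results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyperparameters:
    """Values copied from the card; the class itself is a hypothetical container."""
    learning_rate: float = 5e-05
    train_batch_size: int = 8
    eval_batch_size: int = 8
    seed: int = 42
    adam_beta1: float = 0.9
    adam_beta2: float = 0.999
    adam_epsilon: float = 1e-08
    lr_scheduler_type: str = "linear"
    num_epochs: int = 25


def linear_lr(step: int, total_steps: int, base_lr: float, warmup_steps: int = 0) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay to zero.

    Assumption: warmup_steps defaults to 0, since the card lists no warmup.
    """
    if warmup_steps and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


hp = Hyperparameters()
TOTAL_STEPS = 19650  # final step reported in the training results

# Learning rate at the end of a few epochs (786 steps per epoch in the card).
for epoch in (0, 5, 19, 25):
    step = epoch * 786
    print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step, TOTAL_STEPS, hp.learning_rate):.3e}")
```

Under these assumptions the learning rate has already fallen to roughly a quarter of its initial value by epoch 19, the checkpoint with the lowest validation loss.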
Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 2.0946 | 1.0 | 786 | 1.7462 | 45.4252 | 0.0 | 45.4009 | 45.4139 | 4.0464 |
| 1.7182 | 2.0 | 1572 | 1.5005 | 44.9295 | 0.0 | 44.9183 | 44.9108 | 4.1126 |
| 1.5304 | 3.0 | 2358 | 1.3826 | 39.5888 | 0.0 | 39.5811 | 39.5646 | 4.1698 |
| 1.4261 | 4.0 | 3144 | 1.3121 | 30.1735 | 0.0 | 30.1127 | 30.1415 | 4.1520 |
| 1.3252 | 5.0 | 3930 | 1.2641 | 35.7738 | 0.0 | 35.7408 | 35.7858 | 3.8791 |
| 1.2878 | 6.0 | 4716 | 1.2353 | 33.0773 | 0.0 | 32.9682 | 33.0551 | 3.7252 |
| 1.2068 | 7.0 | 5502 | 1.2051 | 34.4094 | 0.0 | 34.3902 | 34.3884 | 3.7729 |
| 1.1763 | 8.0 | 6288 | 1.1952 | 33.0914 | 0.1908 | 33.0267 | 33.0472 | 3.9739 |
| 1.1346 | 9.0 | 7074 | 1.1798 | 33.9606 | 0.0 | 33.9335 | 33.979 | 4.1768 |
| 1.1044 | 10.0 | 7860 | 1.1632 | 32.9529 | 0.0 | 32.9367 | 32.9396 | 4.1673 |
| 1.1073 | 11.0 | 8646 | 1.1499 | 34.0904 | 0.0 | 34.0659 | 34.1317 | 4.1934 |
| 1.0619 | 12.0 | 9432 | 1.1516 | 32.9502 | 0.0 | 32.9056 | 32.9376 | 4.0312 |
| 1.0365 | 13.0 | 10218 | 1.1478 | 31.68 | 0.0 | 31.6488 | 31.7003 | 4.0293 |
| 1.0161 | 14.0 | 11004 | 1.1427 | 32.6651 | 0.0424 | 32.6345 | 32.6538 | 4.1113 |
| 0.9805 | 15.0 | 11790 | 1.1343 | 34.0304 | 0.0636 | 33.9433 | 33.999 | 4.0674 |
| 0.9661 | 16.0 | 12576 | 1.1309 | 34.8704 | 0.0848 | 34.8014 | 34.8501 | 4.0681 |
| 0.9511 | 17.0 | 13362 | 1.1348 | 32.8744 | 0.0 | 32.8277 | 32.8547 | 4.1081 |
| 0.9392 | 18.0 | 14148 | 1.1326 | 32.9349 | 0.1908 | 32.8895 | 32.9376 | 4.2627 |
| 0.9341 | 19.0 | 14934 | 1.1263 | 33.5111 | 0.1696 | 33.4559 | 33.4934 | 4.1546 |
| 0.9396 | 20.0 | 15720 | 1.1349 | 33.9121 | 0.2545 | 33.8438 | 33.8993 | 4.1705 |
| 0.9314 | 21.0 | 16506 | 1.1276 | 33.0779 | 0.106 | 33.0546 | 33.0903 | 4.1399 |
| 0.8987 | 22.0 | 17292 | 1.1333 | 33.8566 | 0.1696 | 33.7943 | 33.843 | 4.1419 |
| 0.8895 | 23.0 | 18078 | 1.1343 | 33.6108 | 0.1484 | 33.5738 | 33.636 | 4.2328 |
| 0.8847 | 24.0 | 18864 | 1.1355 | 33.4257 | 0.2757 | 33.3804 | 33.4495 | 4.1711 |
| 0.8832 | 25.0 | 19650 | 1.1355 | 33.6211 | 0.3393 | 33.5937 | 33.636 | 4.1959 |

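
A few quantities can be read off the training results with simple arithmetic. The sketch below picks the epoch with the lowest validation loss, derives the number of optimizer steps per epoch, and bounds the training-set size that those steps imply. The size estimate rests on assumptions the card does not confirm: a single device, no gradient accumulation, and a final partial batch that is kept rather than dropped. The variable and function names are illustrative only.

```python
import math

# (epoch, step, training_loss, validation_loss), copied from the training results.
RESULTS = [
    (1, 786, 2.0946, 1.7462), (2, 1572, 1.7182, 1.5005), (3, 2358, 1.5304, 1.3826),
    (4, 3144, 1.4261, 1.3121), (5, 3930, 1.3252, 1.2641), (6, 4716, 1.2878, 1.2353),
    (7, 5502, 1.2068, 1.2051), (8, 6288, 1.1763, 1.1952), (9, 7074, 1.1346, 1.1798),
    (10, 7860, 1.1044, 1.1632), (11, 8646, 1.1073, 1.1499), (12, 9432, 1.0619, 1.1516),
    (13, 10218, 1.0365, 1.1478), (14, 11004, 1.0161, 1.1427), (15, 11790, 0.9805, 1.1343),
    (16, 12576, 0.9661, 1.1309), (17, 13362, 0.9511, 1.1348), (18, 14148, 0.9392, 1.1326),
    (19, 14934, 0.9341, 1.1263), (20, 15720, 0.9396, 1.1349), (21, 16506, 0.9314, 1.1276),
    (22, 17292, 0.8987, 1.1333), (23, 18078, 0.8895, 1.1343), (24, 18864, 0.8847, 1.1355),
    (25, 19650, 0.8832, 1.1355),
]
TRAIN_BATCH_SIZE = 8  # from the hyperparameter list


def best_by_validation_loss(rows):
    """Return the row with the smallest validation loss."""
    return min(rows, key=lambda row: row[3])


def steps_per_epoch(rows):
    """Steps per epoch, checked to be constant across all rows."""
    per_epoch = {step // epoch for epoch, step, _, _ in rows}
    assert len(per_epoch) == 1, "step count per epoch is not constant"
    return per_epoch.pop()


def implied_train_size(steps, batch_size):
    """Inclusive (min, max) example count for which ceil(n / batch_size) == steps.

    Assumes one device, no gradient accumulation, and no dropped last batch.
    """
    bounds = ((steps - 1) * batch_size + 1, steps * batch_size)
    assert all(math.ceil(n / batch_size) == steps for n in bounds)
    return bounds


best = best_by_validation_loss(RESULTS)
spe = steps_per_epoch(RESULTS)
low, high = implied_train_size(spe, TRAIN_BATCH_SIZE)
print(f"best epoch: {best[0]} (step {best[1]}, validation loss {best[3]})")
print(f"steps per epoch: {spe}")
print(f"implied training examples: between {low} and {high}")
```

Read this way, the table shows training loss still falling through epoch 25 while validation loss stays within about 0.01 of its epoch 19 minimum, which is consistent with the headline metrics being taken from that checkpoint.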
Framework versions
- Transformers 4.35.2
- PyTorch 2.0.1+cu117
- Datasets 2.15.0
- Tokenizers 0.15.0
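
When trying to reproduce these numbers, it can help to compare the local environment against the versions listed above. The following is a minimal sketch using only the standard library. The distribution names in `PINNED` are assumptions about how the packages are published on PyPI (PyTorch is distributed as `torch`), and `compare_versions` is a hypothetical helper, not part of any of these libraries.

```python
from importlib import metadata

# Versions from the card, keyed by the assumed PyPI distribution name.
PINNED = {
    "transformers": "4.35.2",
    "torch": "2.0.1+cu117",
    "datasets": "2.15.0",
    "tokenizers": "0.15.0",
}


def compare_versions(pinned):
    """Map each package to (expected, installed or None, exact match)."""
    report = {}
    for package, expected in pinned.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (expected, installed, installed == expected)
    return report


for package, (expected, installed, matches) in compare_versions(PINNED).items():
    status = "ok" if matches else "differs"
    print(f"{package}: expected {expected}, found {installed} ({status})")
```

An exact string match is deliberately strict; a different CUDA build suffix on PyTorch, for example, is reported as a difference even though it may not affect results.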