🚀 aslandmsl
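This model is a fine-tuned version of Helsinki-NLP/opus-mt-es-es. The training dataset is not specified (the card records it as None). It achieves the following results on the evaluation set:
- Loss: 0.1788
- Model Preparation Time: 0.0058
- Bleu Msl: 88.0304
- Bleu Asl: 0
- Ter Msl: 7.4110
- Ter Asl: 100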
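To help read the TER figures, here is a minimal sketch of an edit-rate computation. It is a simplification, not the metric implementation behind the reported scores, which this card does not name: real TER also allows block shifts, which are omitted here. The function name `simple_edit_rate` is hypothetical. The sketch shows why a score of 0 means an exact match and why the value can exceed 100 when the hypothesis is longer than the reference.

```python
def simple_edit_rate(hypothesis: str, reference: str) -> float:
    """Word-level edit distance divided by reference length, as a percentage.

    Assumption: block shifts (part of real TER) are not modelled here.
    """
    hyp, ref = hypothesis.split(), reference.split()
    if not ref:
        raise ValueError("reference must contain at least one token")

    # Dynamic-programming Levenshtein distance over whitespace tokens.
    prev = list(range(len(ref) + 1))  # cost of building ref[:j] from nothing
    for i, h in enumerate(hyp, start=1):
        curr = [i]  # cost of deleting the first i hypothesis tokens
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,           # delete h
                            curr[j - 1] + 1,       # insert r
                            prev[j - 1] + cost))   # match or substitute
        prev = curr
    return 100.0 * prev[-1] / len(ref)
```

With toy inputs, `simple_edit_rate("a b c", "a b c")` gives 0.0, and `simple_edit_rate("x y z", "a b")` gives 150.0, because three edits are needed against a two-token reference.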
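📚 Detailed Documentation
Model description
More information needed.
Intended uses and limitations
More information needed.
Training and evaluation data
More information needed.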
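🔧 Technical Details
Training hyperparameters
The following hyperparameters were used during training:
- Learning rate: 1e-05
- Training batch size: 32
- Evaluation batch size: 64
- Random seed: 42
- Optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
- Learning rate scheduler type: linear
- Number of epochs: 30
- Mixed-precision training: Native AMP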
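As an illustration of what a linear scheduler implies for this run, the sketch below computes the learning rate at a given optimizer step. The total of 6750 steps comes from the training results (30 epochs at 225 steps each). The warmup length is not stated in this card, so `warmup_steps=0` is an assumption, and the function name `linear_lr` is hypothetical.

```python
def linear_lr(step: int,
              base_lr: float = 1e-5,
              total_steps: int = 6750,
              warmup_steps: int = 0) -> float:
    """Learning rate under linear warmup followed by linear decay to zero.

    Assumption: warmup_steps=0, since the card does not report a warmup setting.
    """
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for epoch in (0, 15, 30):
    step = epoch * 225  # 225 optimizer steps per epoch, per the results table
    print(f"epoch {epoch:2d}  step {step:4d}  lr {linear_lr(step):.2e}")
```

Under these assumptions the rate starts at 1e-05, is halved by the end of epoch 15, and reaches zero at step 6750.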
Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Bleu Msl | Bleu Asl | Ter Msl | Ter Asl |
|---|---|---|---|---|---|---|---|---|
| No log | 1.0 | 225 | 1.5653 | 0.0058 | 6.5801 | 55.0209 | 107.8081 | 37.3399 |
| No log | 2.0 | 450 | 0.9988 | 0.0058 | 36.1652 | 80.6836 | 45.5315 | 8.5089 |
| 1.7595 | 3.0 | 675 | 0.6401 | 0.0058 | 50.4479 | 83.7950 | 32.2672 | 7.3110 |
| 1.7595 | 4.0 | 900 | 0.4573 | 0.0058 | 61.3116 | 66.8757 | 25.3057 | 6.1545 |
| 0.6205 | 5.0 | 1125 | 0.3856 | 0.0058 | 66.5991 | 88.5773 | 21.8250 | 5.1219 |
| 0.6205 | 6.0 | 1350 | 0.3448 | 0.0058 | 43.1115 | 89.5128 | 31.3264 | 4.5849 |
| 0.3287 | 7.0 | 1575 | 0.3144 | 0.0058 | 65.9756 | 89.9086 | 20.4139 | 4.5023 |
| 0.3287 | 8.0 | 1800 | 0.2754 | 0.0058 | 45.0564 | 90.8438 | 28.8805 | 4.0479 |
| 0.2225 | 9.0 | 2025 | 0.2410 | 0.0058 | 72.2558 | 90.5190 | 16.5569 | 4.2131 |
| 0.2225 | 10.0 | 2250 | 0.2229 | 0.0058 | 72.6469 | 90.9231 | 15.6162 | 4.0892 |
| 0.2225 | 11.0 | 2475 | 0.2126 | 0.0058 | 73.4167 | 91.5905 | 14.9577 | 3.8827 |
| 0.1448 | 12.0 | 2700 | 0.2049 | 0.0058 | 74.4555 | 70.4375 | 14.8636 | 4.0892 |
| 0.1448 | 13.0 | 2925 | 0.1993 | 0.0058 | 73.3591 | 91.3585 | 15.0517 | 4.0066 |
| 0.11 | 14.0 | 3150 | 0.1958 | 0.0058 | 73.9381 | 91.3182 | 14.0169 | 3.8827 |
| 0.11 | 15.0 | 3375 | 0.1890 | 0.0058 | 75.5526 | 91.6437 | 14.2051 | 3.8001 |
| 0.0882 | 16.0 | 3600 | 0.1881 | 0.0058 | 73.7777 | 91.8284 | 14.4873 | 3.7588 |
| 0.0882 | 17.0 | 3825 | 0.1851 | 0.0058 | 75.4362 | 91.4902 | 14.2051 | 3.7588 |
| 0.0723 | 18.0 | 4050 | 0.1850 | 0.0058 | 75.6099 | 92.0202 | 14.4873 | 3.6349 |
| 0.0723 | 19.0 | 4275 | 0.1822 | 0.0058 | 76.2459 | 91.9730 | 14.0169 | 3.6349 |
| 0.0641 | 20.0 | 4500 | 0.1839 | 0.0058 | 75.0209 | 91.9730 | 14.0169 | 3.6349 |
| 0.0641 | 21.0 | 4725 | 0.1806 | 0.0058 | 75.7669 | 92.0658 | 13.8288 | 3.5936 |
| 0.0641 | 22.0 | 4950 | 0.1809 | 0.0058 | 76.2001 | 92.0484 | 13.2643 | 3.5936 |
| 0.0576 | 23.0 | 5175 | 0.1793 | 0.0058 | 75.9506 | 92.2068 | 13.7347 | 3.5109 |
| 0.0576 | 24.0 | 5400 | 0.1781 | 0.0058 | 76.3576 | 92.3340 | 13.4525 | 3.4696 |
| 0.0515 | 25.0 | 5625 | 0.1789 | 0.0058 | 75.8648 | 92.1142 | 13.3584 | 3.5936 |
| 0.0515 | 26.0 | 5850 | 0.1784 | 0.0058 | 76.3297 | 92.2886 | 12.8881 | 3.5109 |
| 0.0479 | 27.0 | 6075 | 0.1788 | 0.0058 | 76.0603 | 92.5564 | 13.2643 | 3.3870 |
| 0.0479 | 28.0 | 6300 | 0.1778 | 0.0058 | 76.3080 | 92.3287 | 13.0762 | 3.5109 |
| 0.0469 | 29.0 | 6525 | 0.1780 | 0.0058 | 76.3707 | 92.3287 | 13.0762 | 3.5109 |
| 0.0469 | 30.0 | 6750 | 0.1781 | 0.0058 | 76.3707 | 92.3287 | 13.0762 | 3.5109 |
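One way to read the training results is to find the epoch with the lowest validation loss. The sketch below does this using the validation-loss values copied from the results above. The helper name `best_epoch` is hypothetical, and the card does not say which checkpoint was actually kept, so this is only an illustration of the selection step.

```python
# Validation loss for epochs 1..30, copied from the training results.
VALIDATION_LOSS = [
    1.5653, 0.9988, 0.6401, 0.4573, 0.3856, 0.3448, 0.3144, 0.2754, 0.2410, 0.2229,
    0.2126, 0.2049, 0.1993, 0.1958, 0.1890, 0.1881, 0.1851, 0.1850, 0.1822, 0.1839,
    0.1806, 0.1809, 0.1793, 0.1781, 0.1789, 0.1784, 0.1788, 0.1778, 0.1780, 0.1781,
]


def best_epoch(losses):
    """Return (epoch, loss) for the lowest loss; epochs are 1-indexed."""
    index = min(range(len(losses)), key=losses.__getitem__)
    return index + 1, losses[index]


epoch, loss = best_epoch(VALIDATION_LOSS)
print(f"lowest validation loss {loss:.4f} at epoch {epoch}")
# prints: lowest validation loss 0.1778 at epoch 28
```

Validation loss changes by less than 0.002 across epochs 23 to 30, so the later checkpoints are close to one another on this measure.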
Framework versions
- Transformers 4.46.2
- PyTorch 2.5.1+cu121
- Datasets 3.1.0
- Tokenizers 0.20.3
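To check whether a local environment matches the versions listed above, a small standard-library sketch such as the following may help. The helper names are hypothetical. It assumes the PyPI distribution names `transformers`, `torch`, `datasets`, and `tokenizers`, and it compares version strings exactly, so a different CUDA build suffix on `torch` is reported as a mismatch.

```python
from importlib import metadata

# Versions listed in this card, keyed by PyPI distribution name.
EXPECTED = {
    "transformers": "4.46.2",
    "torch": "2.5.1+cu121",
    "datasets": "3.1.0",
    "tokenizers": "0.20.3",
}


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def version_mismatches(expected, installed):
    """Return {name: (expected, installed)} for every entry that differs."""
    return {
        name: (want, installed.get(name))
        for name, want in expected.items()
        if installed.get(name) != want
    }


if __name__ == "__main__":
    diffs = version_mismatches(EXPECTED, installed_versions(EXPECTED))
    for name, (want, got) in diffs.items():
        print(f"{name}: expected {want}, found {got or 'not installed'}")
    if not diffs:
        print("all listed versions match")
```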
📄 License
This project is licensed under the Apache-2.0 license.