wav2vec2_base_10k_8khz_pt_cv7_2
Quick Start
This model is a fine-tuned version of lgris/seasr_2022_base_10k_8khz_pt on the common_voice dataset. It achieves the following results on the evaluation set:
- Loss: 76.3426
- Wer: 0.1979
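A minimal inference sketch using the Transformers API is shown below. The Hub repo ID (`lgris/wav2vec2_base_10k_8khz_pt_cv7_2`) and the audio file path are assumptions; adjust them to your setup. The processor reports the sampling rate the model expects, so input audio is resampled to match.

```python
import torch
import torchaudio
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

# Assumed Hub repo ID; adjust to the actual path of this model.
model_id = "lgris/wav2vec2_base_10k_8khz_pt_cv7_2"
processor = Wav2Vec2Processor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

# Load a Portuguese audio clip (placeholder path) and resample it
# to the rate the processor expects.
waveform, sr = torchaudio.load("sample.wav")
waveform = waveform.mean(dim=0)  # downmix to mono if needed
target_sr = processor.feature_extractor.sampling_rate
if sr != target_sr:
    waveform = torchaudio.functional.resample(waveform, sr, target_sr)

inputs = processor(waveform.numpy(), sampling_rate=target_sr, return_tensors="pt")
with torch.no_grad():
    logits = model(inputs.input_values).logits
pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])
```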
Documentation
Model Information
| Property | Details |
|----------|---------|
| Model Type | Fine-tuned version of lgris/seasr_2022_base_10k_8khz_pt on the common_voice dataset |
| Training Data | mozilla-foundation/common_voice_7_0 |
| Tags | automatic-speech-recognition, generated_from_trainer, hf-asr-leaderboard, mozilla-foundation/common_voice_7_0, pt, robust-speech-event |
Results on Datasets
| Task | Dataset | Metric | Value |
|------|---------|--------|-------|
| Automatic Speech Recognition | Common Voice 7 (pt) | Test WER | 36.9 |
| Automatic Speech Recognition | Common Voice 7 (pt) | Test CER | 14.82 |
| Automatic Speech Recognition | Robust Speech Event - Dev Data (sv) | Test WER | 40.53 |
| Automatic Speech Recognition | Robust Speech Event - Dev Data (sv) | Test CER | 16.95 |
| Automatic Speech Recognition | Robust Speech Event - Dev Data (pt) | Test WER | 37.15 |
| Automatic Speech Recognition | Robust Speech Event - Test Data (pt) | Test WER | 38.95 |
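The WER and CER figures above are word and character error rates in percent. As an illustrative, hedged example (not the exact script that produced these numbers), error rates of this kind can be computed with the `evaluate` library:

```python
# Illustrative only: compute WER/CER with the `evaluate` library.
# The strings below are hypothetical examples, not real model output.
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

predictions = ["ola mundo como vai"]   # hypothetical model output
references = ["olá mundo como vai"]    # hypothetical reference transcript

print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
```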
Technical Details
Training Hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- training_steps: 10000
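As a hedged sketch, assuming the standard Trainer API was used (the original training script is not part of this card), the settings above would map to `transformers.TrainingArguments` roughly as follows; `output_dir` is a placeholder:

```python
# Hedged sketch: how the hyperparameters listed above might map to
# transformers.TrainingArguments. output_dir is a placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./wav2vec2_base_10k_8khz_pt_cv7_2",  # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,  # 8 x 2 = total_train_batch_size of 16
    max_steps=10000,                # training_steps
    lr_scheduler_type="linear",
    warmup_steps=100,
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the transformers defaults.
)
```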
Training Results
| Training Loss | Epoch | Step | Validation Loss | Wer |
|:-------------:|:-----:|:----:|:---------------:|:------:|
| 189.1362 | 0.65 | 500 | 80.6347 | 0.2139 |
| 174.2587 | 1.3 | 1000 | 80.2062 | 0.2116 |
| 164.676 | 1.95 | 1500 | 78.2161 | 0.2073 |
| 176.5856 | 2.6 | 2000 | 78.8920 | 0.2074 |
| 164.3583 | 3.25 | 2500 | 77.2865 | 0.2066 |
| 161.414 | 3.9 | 3000 | 77.8888 | 0.2048 |
| 158.283 | 4.55 | 3500 | 77.3472 | 0.2033 |
| 159.2265 | 5.19 | 4000 | 79.0953 | 0.2036 |
| 156.3967 | 5.84 | 4500 | 76.6855 | 0.2029 |
| 154.2743 | 6.49 | 5000 | 77.7785 | 0.2015 |
| 156.6497 | 7.14 | 5500 | 77.1220 | 0.2033 |
| 157.3038 | 7.79 | 6000 | 76.2926 | 0.2027 |
| 162.8151 | 8.44 | 6500 | 76.7602 | 0.2013 |
| 151.8613 | 9.09 | 7000 | 77.4777 | 0.2011 |
| 153.0225 | 9.74 | 7500 | 76.5206 | 0.2001 |
| 157.52 | 10.39 | 8000 | 76.1061 | 0.2006 |
| 145.0592 | 11.04 | 8500 | 76.7855 | 0.1992 |
| 150.0066 | 11.69 | 9000 | 76.0058 | 0.1988 |
| 146.8128 | 12.34 | 9500 | 76.2853 | 0.1987 |
| 146.9148 | 12.99 | 10000 | 76.3426 | 0.1979 |
Framework Versions
- Transformers 4.16.2
- Pytorch 1.10.0+cu111
- Datasets 1.18.3
- Tokenizers 0.11.0
License
This model is released under the apache-2.0 license.