wav2vec2-base-timit-demo-colab971
This model is a fine-tuned version of facebook/wav2vec2-base on an unspecified dataset (recorded as "None" in the training metadata). It achieves the results reported in the Training results section below on the evaluation set.
Quick Start
This section provides a high-level overview of the model and its performance. The model is based on the pre-trained facebook/wav2vec2-base checkpoint and fine-tuned on a dataset that is not specified in this card. The evaluation results below report its performance in terms of validation loss and word error rate (WER).
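As a quick-start illustration, the sketch below shows how a fine-tuned wav2vec2 CTC model of this kind can typically be loaded and run with the transformers library. The repository id is taken from the model name in this card and, together with the 16 kHz mono input, should be treated as an assumption rather than a confirmed detail.

```python
import torch
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

# Repository id is assumed from the model name shown in this card.
MODEL_ID = "wav2vec2-base-timit-demo-colab971"

processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID)

def transcribe(speech):
    """`speech` is assumed to be a 1-D float array of 16 kHz mono audio samples."""
    inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits           # (batch, time, vocab)
    predicted_ids = torch.argmax(logits, dim=-1)  # greedy CTC decoding
    return processor.batch_decode(predicted_ids)[0]
```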
Documentation
Model description
This model is a fine-tuned version of facebook/wav2vec2-base. More detailed information, such as the model's architecture and how fine-tuning affects its performance, has yet to be provided.
Intended uses & limitations
More information about the intended uses of this model and its limitations needs to be added, such as the types of speech recognition tasks it is best suited for and any scenarios where it may not perform well.
Training and evaluation data
Details about the training and evaluation data are lacking. Information such as the source of the data, its size, and the characteristics of the speech samples would be valuable.
Training procedure
Training hyperparameters
The following hyperparameters were used during training (see the TrainingArguments sketch after the list):
- learning_rate: 0.0001
- train_batch_size: 2
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 1000
- num_epochs: 30
- mixed_precision_training: Native AMP
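For reference, these settings map roughly onto transformers.TrainingArguments as sketched below. This is an illustrative sketch, not the exact training script used for this model; the output_dir value is an assumption, and the Adam betas and epsilon listed above correspond to the optimizer defaults, so they are not set explicitly.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./wav2vec2-base-timit-demo-colab971",  # assumed output path
    learning_rate=1e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=1000,
    num_train_epochs=30,
    fp16=True,  # Native AMP mixed-precision training
)
```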
Training results
| Training Loss | Epoch | Step | Validation Loss | Wer    |
|:-------------:|:-----:|:----:|:---------------:|:------:|
| 4.9461        | 1.77  | 500  | 3.2175          | 1.0    |
| 2.5387        | 3.53  | 1000 | 1.2239          | 0.7851 |
| 0.9632        | 5.3   | 1500 | 0.7275          | 0.6352 |
| 0.6585        | 7.07  | 2000 | 0.6218          | 0.5896 |
| 0.4875        | 8.83  | 2500 | 0.5670          | 0.5651 |
| 0.397         | 10.6  | 3000 | 0.5796          | 0.5487 |
| 0.3298        | 12.37 | 3500 | 0.5870          | 0.5322 |
| 0.2816        | 14.13 | 4000 | 0.5796          | 0.5016 |
| 0.2396        | 15.9  | 4500 | 0.5956          | 0.5040 |
| 0.2019        | 17.67 | 5000 | 0.5911          | 0.4847 |
| 0.1845        | 19.43 | 5500 | 0.6050          | 0.4800 |
| 0.1637        | 21.2  | 6000 | 0.6518          | 0.4927 |
| 0.1428        | 22.97 | 6500 | 0.6247          | 0.4645 |
| 0.1319        | 24.73 | 7000 | 0.6592          | 0.4711 |
| 0.1229        | 26.5  | 7500 | 0.6526          | 0.4556 |
| 0.1111        | 28.27 | 8000 | 0.6551          | 0.4448 |
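The word error rates above are standard WER scores over reference and predicted transcripts. The snippet below is a minimal sketch of how such values can be computed with the jiwer package; jiwer is not listed among the framework versions for this model, so treat its use here as an assumption (the WER metric in the datasets library gives equivalent numbers).

```python
from jiwer import wer  # assumed helper package, not listed in the framework versions below

references = ["the quick brown fox", "hello world"]  # ground-truth transcripts
hypotheses = ["the quick brown fox", "hello word"]   # model predictions

# WER = (substitutions + deletions + insertions) / number of reference words
print(wer(references, hypotheses))  # 1 substitution over 6 reference words ≈ 0.167
```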
Framework versions
- Transformers 4.11.3
- Pytorch 1.11.0+cu113
- Datasets 1.18.3
- Tokenizers 0.10.3
License
This model is licensed under the Apache 2.0 license.