xtreme_s_xlsr_300m_minds14
This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the GOOGLE/XTREME_S - MINDS14.ALL dataset. It performs spoken intent classification across the 14 language varieties listed below, with per-language accuracy, F1, and loss reported in the evaluation results.
Quick Start
This model is ready to use for speech-related tasks. You can load it directly from the Hugging Face Hub and run inference, as sketched below.
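A minimal inference sketch, with assumptions: the `model_id` below is a placeholder for the actual Hub repository path of this checkpoint, and `minds14.en-US` is assumed to be a valid config name of the google/xtreme_s dataset for fetching a test clip.

```python
import torch
from datasets import load_dataset
from transformers import AutoFeatureExtractor, AutoModelForAudioClassification

# Placeholder repo ID; substitute the actual Hub path of this checkpoint.
model_id = "xtreme_s_xlsr_300m_minds14"

feature_extractor = AutoFeatureExtractor.from_pretrained(model_id)
model = AutoModelForAudioClassification.from_pretrained(model_id)

# Fetch one MINDS-14 example (16 kHz audio) for a quick test.
# Assumption: "minds14.en-US" is the per-locale config name.
dataset = load_dataset("google/xtreme_s", "minds14.en-US", split="test")
sample = dataset[0]["audio"]

inputs = feature_extractor(
    sample["array"],
    sampling_rate=sample["sampling_rate"],
    return_tensors="pt",
)

with torch.no_grad():
    logits = model(**inputs).logits

# Map the highest-scoring class index back to its intent label.
predicted_id = int(logits.argmax(dim=-1))
print(model.config.id2label[predicted_id])
```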
Documentation
Model Evaluation Results
It achieves the following results on the evaluation set (a sketch of how these metrics can be computed follows the per-language table):
Overall:
- Accuracy: 0.9033
- F1: 0.9015
- Loss: 0.4119
- Predict Samples: 4086

Per language:

| Language | Accuracy | F1 | Loss |
|:---|:---:|:---:|:---:|
| cs-CZ | 0.9164 | 0.9154 | 0.3790 |
| de-DE | 0.9477 | 0.9467 | 0.2649 |
| en-AU | 0.9235 | 0.9199 | 0.3459 |
| en-GB | 0.9324 | 0.9334 | 0.2853 |
| en-US | 0.9326 | 0.9308 | 0.2203 |
| es-ES | 0.9177 | 0.9158 | 0.2731 |
| fr-FR | 0.9444 | 0.9436 | 0.1909 |
| it-IT | 0.9167 | 0.9135 | 0.3520 |
| ko-KR | 0.8649 | 0.8642 | 0.5431 |
| nl-NL | 0.9450 | 0.9440 | 0.2515 |
| pl-PL | 0.9146 | 0.9159 | 0.4113 |
| pt-PT | 0.8940 | 0.8883 | 0.4798 |
| ru-RU | 0.8667 | 0.8646 | 0.6470 |
| zh-CN | 0.7291 | 0.7249 | 1.1216 |
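For reference, a minimal sketch of how the overall and per-language scores could be reproduced. Assumptions: scikit-learn as the metric backend, macro-averaged F1 (the averaging mode is not stated in the card), and a hypothetical `per_language_metrics` helper for the per-locale breakdown.

```python
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(predictions, references):
    """Accuracy and (assumed) macro-averaged F1, as reported above."""
    return {
        "accuracy": accuracy_score(references, predictions),
        "f1": f1_score(references, predictions, average="macro"),
    }

def per_language_metrics(predictions, references, langs):
    """Group examples by locale tag, then score each group separately."""
    results = {}
    for lang in sorted(set(langs)):
        idx = [i for i, l in enumerate(langs) if l == lang]
        results[lang] = compute_metrics(
            [predictions[i] for i in idx],
            [references[i] for i in idx],
        )
    return results
```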
Training Procedure
Training hyperparameters
The following hyperparameters were used during training (a sketch of the corresponding TrainingArguments follows the list):
- learning_rate: 0.0003
- train_batch_size: 32
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- total_train_batch_size: 64
- total_eval_batch_size: 16
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 1500
- num_epochs: 50.0
- mixed_precision_training: Native AMP
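A hedged reconstruction of how this configuration might map onto `transformers.TrainingArguments`. The 200-step evaluation/save cadence is inferred from the training-results table below, Adam's betas and epsilon are the library defaults, and the `output_dir` is a placeholder; with 2 GPUs, a per-device train batch size of 32 yields the listed total of 64.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="xtreme_s_xlsr_300m_minds14",  # placeholder
    learning_rate=3e-4,
    per_device_train_batch_size=32,   # x2 GPUs = total batch size 64
    per_device_eval_batch_size=8,     # x2 GPUs = total batch size 16
    seed=42,
    num_train_epochs=50.0,
    lr_scheduler_type="linear",
    warmup_steps=1500,
    fp16=True,                        # Native AMP mixed precision
    evaluation_strategy="steps",      # assumption: eval every 200 steps
    eval_steps=200,
    save_steps=200,
    logging_steps=200,
)
```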
Training results
| Training Loss | Epoch | Step | Validation Loss | F1 | Accuracy |
|:---:|:---:|:---:|:---:|:---:|:---:|
| 2.6739 | 5.41 | 200 | 2.5687 | 0.0430 | 0.1190 |
| 1.4953 | 10.81 | 400 | 1.6052 | 0.5550 | 0.5692 |
| 0.6177 | 16.22 | 600 | 0.7927 | 0.8052 | 0.8011 |
| 0.3609 | 21.62 | 800 | 0.5679 | 0.8609 | 0.8609 |
| 0.4972 | 27.03 | 1000 | 0.5944 | 0.8509 | 0.8523 |
| 0.1799 | 32.43 | 1200 | 0.6194 | 0.8623 | 0.8621 |
| 0.1308 | 37.84 | 1400 | 0.5956 | 0.8569 | 0.8548 |
| 0.2298 | 43.24 | 1600 | 0.5201 | 0.8732 | 0.8743 |
| 0.0052 | 48.65 | 1800 | 0.3826 | 0.9106 | 0.9103 |
Framework versions
- Transformers 4.18.0.dev0
- Pytorch 1.10.2+cu113
- Datasets 2.0.1.dev0
- Tokenizers 0.11.6
License
This project is licensed under the Apache-2.0 license.
| Property | Details |
|:---|:---|
| Model Type | Fine-tuned version of facebook/wav2vec2-xls-r-300m on the GOOGLE/XTREME_S - MINDS14.ALL dataset |
| Training Data | GOOGLE/XTREME_S - MINDS14.ALL dataset |
| Metrics | F1, Accuracy |