wav2vec2-large-xls-r-300m-slovenian
This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the MOZILLA-FOUNDATION/COMMON_VOICE_7_0 - SL dataset, intended for automatic speech recognition in Slovenian.
Quick Start
The model's evaluation metrics are reported under Training results and Results below.
Features
- Language Support: Designed for automatic speech recognition in Slovenian.
- Fine-Tuned: Based on the pre-trained facebook/wav2vec2-xls-r-300m model, fine-tuned on the MOZILLA-FOUNDATION/COMMON_VOICE_7_0 - SL dataset.
Usage Examples
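The following is a minimal inference sketch, not the author's published usage. It assumes the `transformers` and `torch` packages are installed and that the input is 16 kHz mono audio, which is what XLS-R checkpoints expect. `MODEL_ID` is a placeholder: this card does not state the Hub namespace, so substitute the real repository ID. The helper name `transcribe` is illustrative and not part of any library.

```python
# Placeholder repository ID (assumption): replace the namespace with the real one.
MODEL_ID = "your-namespace/wav2vec2-large-xls-r-300m-slovenian"

# XLS-R checkpoints are trained on audio sampled at 16 kHz.
TARGET_SAMPLING_RATE = 16_000


def transcribe(waveform, sampling_rate, model_id=MODEL_ID):
    """Return a greedy CTC transcription of a 1-D float waveform."""
    if sampling_rate != TARGET_SAMPLING_RATE:
        raise ValueError(
            f"Expected {TARGET_SAMPLING_RATE} Hz audio, got {sampling_rate} Hz; "
            "resample before calling transcribe()."
        )

    # Imported lazily so the input check above works without the heavy dependencies.
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    model.eval()

    # Convert the raw waveform into model inputs.
    inputs = processor(waveform, sampling_rate=sampling_rate, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits

    # Greedy decoding: pick the most likely token per frame, then collapse via CTC rules.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

To use it, load a recording with any audio loader that yields a 16 kHz mono float array and call `transcribe(waveform, 16_000)`.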
Documentation
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 7e-05
- train_batch_size: 32
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 1000
- num_epochs: 100.0
- mixed_precision_training: Native AMP
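To make the learning-rate settings above concrete, here is a small sketch of a linear schedule with warmup, the shape that `lr_scheduler_type: linear` selects in Transformers. The peak rate and warmup length come from the list above. The total of 8000 steps is an assumption taken from the last row of the training results, and the function name is illustrative.

```python
PEAK_LR = 7e-05        # learning_rate from the hyperparameter list
WARMUP_STEPS = 1000    # lr_scheduler_warmup_steps from the hyperparameter list
TOTAL_STEPS = 8000     # assumption: final step reported in the training results


def linear_schedule_lr(step):
    """Learning rate at a given optimizer step under linear warmup and linear decay."""
    if step < WARMUP_STEPS:
        # Warmup: ramp linearly from 0 up to the peak rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Decay: fall linearly from the peak rate to 0 at the final step.
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    for step in (0, 500, 1000, 4500, 8000):
        print(f"step {step:>5}: lr = {linear_schedule_lr(step):.2e}")
```

Under these assumptions the rate peaks at step 1000 and is back to half of its peak by step 4500, midway through the decay phase.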
Training results
| Training Loss | Epoch | Step | Validation Loss | Wer    |
|---------------|-------|------|-----------------|--------|
| 1.785         | 12.5  | 1000 | 0.7465          | 0.6812 |
| 0.8989        | 25.0  | 2000 | 0.2495          | 0.2732 |
| 0.7118        | 37.5  | 3000 | 0.2126          | 0.2284 |
| 0.6367        | 50.0  | 4000 | 0.2049          | 0.2049 |
| 0.5763        | 62.5  | 5000 | 0.2116          | 0.2055 |
| 0.5196        | 75.0  | 6000 | 0.2111          | 0.1910 |
| 0.4949        | 87.5  | 7000 | 0.2131          | 0.1931 |
| 0.4797        | 100.0 | 8000 | 0.2093          | 0.1907 |
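The Wer column reports word error rate as a fraction (0.1907 is about 19.07%). As an illustration of what that metric measures, here is a minimal pure-Python sketch of word and character error rate based on Levenshtein distance. It is not the evaluation script used for this model; real evaluations usually rely on a library such as `jiwer` and may normalize text first. The function names and the example sentences are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    reference = "dober dan vsem skupaj"
    hypothesis = "dober dan vsen"
    # One substituted word and one deleted word out of four reference words.
    print(f"WER: {wer(reference, hypothesis):.2%}")
    print(f"CER: {cer(reference, hypothesis):.2%}")
```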
Framework versions
- Transformers 4.16.0.dev0
- Pytorch 1.10.1+cu102
- Datasets 1.18.3
- Tokenizers 0.11.0
License
This model is licensed under the Apache-2.0 license.
Model Index
| Property      | Details                                  |
|---------------|------------------------------------------|
| Model Type    | wav2vec2-large-xls-r-300m-slovenian      |
| Training Data | MOZILLA-FOUNDATION/COMMON_VOICE_7_0 - SL |
Results
- Task: Automatic Speech Recognition
- Dataset: Common Voice 7 (mozilla-foundation/common_voice_7_0, args: sl)
  - Metrics:
    - Test WER: 18.97
    - Test CER: 4.534
- Dataset: Robust Speech Event - Dev Data (speech-recognition-community-v2/dev_data, args: sl)
  - Metrics:
    - Test WER: 55.048
    - Test CER: 22.739
- Dataset: Robust Speech Event - Test Data (speech-recognition-community-v2/eval_data, args: sl)