🚀 Whisper Telugu - Fine-tuned
This model is a fine-tuned version of openai/whisper-large-v2 on the Telugu Audio Dataset (sagarchapara/telugu-audio). It targets automatic speech recognition for the Telugu language, transcribing Telugu audio to text.
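A minimal inference sketch using the transformers pipeline API; the repo id and audio path below are placeholders, not confirmed locations for this model:

```python
from transformers import pipeline

# Placeholder repo id -- substitute the actual Hub path of this model.
asr = pipeline(
    "automatic-speech-recognition",
    model="<username>/whisper-telugu-finetuned",
)

# Transcribe a local Telugu audio file (placeholder path); forcing the
# language and task keeps Whisper from auto-detecting them.
result = asr(
    "telugu_sample.wav",
    generate_kwargs={"language": "telugu", "task": "transcribe"},
)
print(result["text"])
```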
📚 Documentation
Model Information
| Property | Details |
|---|---|
| Library Name | transformers |
| Language | te |
| License | apache-2.0 |
| Base Model | openai/whisper-large-v2 |
| Tags | generated_from_trainer |
| Datasets | sagarchapara/telugu-audio |
| Metrics | wer |
Model Performance
This model achieves the following results on the evaluation set:
- Loss: 3.5889
- Wer: 92.3967
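The reported Wer is the standard word error rate expressed as a percentage. A minimal sketch of the computation with the evaluate library (the sentences below are illustrative placeholders, not drawn from the evaluation set):

```python
import evaluate  # also requires the jiwer backend: pip install jiwer

wer_metric = evaluate.load("wer")

# Placeholder prediction/reference pairs for illustration only.
predictions = ["నమస్కారం అందరికీ"]
references = ["నమస్కారం అందరికి"]

# evaluate returns a fraction; the card reports it as a percentage.
wer = 100 * wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}")
```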
Model Index
- Name: Whisper Telugu - Fine-tuned
- Results:
  - Task:
    - Name: Automatic Speech Recognition
    - Type: automatic-speech-recognition
  - Dataset:
    - Name: Telugu Audio Dataset
    - Type: sagarchapara/telugu-audio
    - Config: te_in
    - Split: None
    - Args: 'split: train'
  - Metrics:
    - Name: Wer
    - Type: wer
    - Value: 92.39665881345041
Training and Evaluation
Training Hyperparameters
The following hyperparameters were used during training (a configuration sketch in code follows the list):
- learning_rate: 0.0002
- train_batch_size: 4
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- training_steps: 10000
- mixed_precision_training: Native AMP
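For reference, a `Seq2SeqTrainingArguments` sketch that mirrors the hyperparameters above; `output_dir` and anything else not in the list are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-telugu-finetuned",  # assumption: not stated in the card
    learning_rate=2e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=4,  # yields the total train batch size of 16
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=10000,
    fp16=True,  # "Native AMP" mixed-precision training
)
```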
Training Results
| Training Loss | Epoch | Step | Validation Loss | Wer |
|:-------------:|:------:|:-----:|:---------------:|:-------:|
| 0.384 | 0.1797 | 250 | 0.9966 | 96.1662 |
| 0.434 | 0.3595 | 500 | 1.4886 | 98.5007 |
| 0.4014 | 0.5392 | 750 | 1.4760 | 97.7940 |
| 0.3318 | 0.7189 | 1000 | 1.5314 | 97.7511 |
| 0.3014 | 0.8986 | 1250 | 1.5504 | 97.8368 |
| 0.2213 | 1.0784 | 1500 | 1.6095 | 97.3656 |
| 0.2212 | 1.2581 | 1750 | 1.6825 | 96.1662 |
| 0.2323 | 1.4378 | 2000 | 1.5175 | 97.6012 |
| 0.2049 | 1.6175 | 2250 | 2.0035 | 97.7940 |
| 0.1834 | 1.7973 | 2500 | 1.6968 | 96.4232 |
| 0.2012 | 1.9770 | 2750 | 1.7613 | 97.3013 |
| 0.1426 | 2.1567 | 3000 | 1.5106 | 95.9734 |
| 0.1344 | 2.3364 | 3250 | 1.7199 | 95.5665 |
| 0.1512 | 2.5162 | 3500 | 1.9328 | 94.8169 |
| 0.1346 | 2.6959 | 3750 | 1.7806 | 96.0805 |
| 0.1211 | 2.8756 | 4000 | 2.0429 | 95.6736 |
| 0.0824 | 3.0554 | 4250 | 2.0699 | 95.3309 |
| 0.0936 | 3.2351 | 4500 | 2.0379 | 96.1876 |
| 0.0946 | 3.4148 | 4750 | 2.1346 | 95.9092 |
| 0.0904 | 3.5945 | 5000 | 2.1195 | 95.0311 |
| 0.0937 | 3.7743 | 5250 | 1.7738 | 95.1810 |
| 0.0836 | 3.9540 | 5500 | 2.0081 | 95.1167 |
| 0.0525 | 4.1337 | 5750 | 2.3687 | 94.9240 |
| 0.0562 | 4.3134 | 6000 | 2.2252 | 95.1381 |
| 0.0506 | 4.4932 | 6250 | 2.5513 | 95.5022 |
| 0.0592 | 4.6729 | 6500 | 2.5357 | 95.6736 |
| 0.0521 | 4.8526 | 6750 | 2.4758 | 95.8235 |
| 0.0276 | 5.0324 | 7000 | 2.8255 | 94.9454 |
| 0.0278 | 5.2121 | 7250 | 2.6255 | 94.7740 |
| 0.0311 | 5.3918 | 7500 | 3.0046 | 94.4956 |
| 0.0269 | 5.5715 | 7750 | 2.8301 | 94.7312 |
| 0.0242 | 5.7513 | 8000 | 2.8859 | 94.2386 |
| 0.0255 | 5.9310 | 8250 | 2.5873 | 93.4676 |
| 0.0157 | 6.1107 | 8500 | 3.4027 | 93.6175 |
| 0.0092 | 6.2904 | 8750 | 3.5842 | 93.6389 |
| 0.0118 | 6.4702 | 9000 | 3.2694 | 93.9602 |
| 0.0086 | 6.6499 | 9250 | 3.3464 | 93.5318 |
| 0.01 | 6.8296 | 9500 | 3.4414 | 93.4461 |
| 0.0065 | 7.0093 | 9750 | 3.3491 | 92.6108 |
| 0.002 | 7.1891 | 10000 | 3.5889 | 92.3967 |
Framework Versions
- Transformers 4.49.0
- Pytorch 2.6.0+cu124
- Datasets 3.3.2
- Tokenizers 0.21.0
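To reproduce this environment, the pinned versions above can be installed along these lines; the CUDA 12.4 wheel index is inferred from the `+cu124` build tag:

```bash
pip install transformers==4.49.0 datasets==3.3.2 tokenizers==0.21.0
pip install torch==2.6.0 --index-url https://download.pytorch.org/whl/cu124
```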
📄 License
This model is released under the Apache 2.0 license.