🚀 Whisper Small Sl - samolego
This model is a fine-tuned version of openai/whisper-small on the ASR database ARTUR 1.0 (audio) dataset. It addresses the need for accurate speech-to-text conversion in Slovenian, leveraging the pre-trained capabilities of the base model and fine-tuning them on this dataset. On the evaluation set it reaches a loss of 0.1226 and a word error rate (WER) of 11.0097.
⨠Features
- Multiple Formats: Both `ggml` and `safetensors` formats are available.
- Fine-Tuned Performance: Achieves a loss of 0.1226 and a WER of 11.0097 on the evaluation set.
📦 Installation
The original document does not list installation steps. Since the model was trained with Hugging Face Transformers and PyTorch (see the framework versions below), installing those libraries, e.g. `pip install transformers torch`, should be sufficient to run it.
💻 Usage Examples
The original document does not include code examples.
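As a minimal sketch of typical usage, the model can be run through the Transformers ASR pipeline. The repository id `samolego/whisper-small-sl` is an assumption inferred from the title and is not confirmed by the original card.

```python
# Minimal sketch: transcribing an audio file with the Transformers ASR pipeline.
# Assumptions: the repo id "samolego/whisper-small-sl" (inferred from the title)
# and a local file "sample.wav". Decoding audio files requires ffmpeg.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="samolego/whisper-small-sl",  # assumed repository id
)

result = asr("sample.wav")  # path to a Slovenian speech recording
print(result["text"])
```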
📚 Documentation
Model description
Both `ggml` and `safetensors` formats are available. If you're not familiar with ggml, I'd suggest checking out [whisper.cpp](https://github.com/ggerganov/whisper.cpp).
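For the `safetensors` weights, loading the model directly with Transformers is one plausible path; below is a sketch under the same assumed repository id, with a placeholder waveform standing in for real audio.

```python
# Sketch: loading the safetensors weights via Transformers and transcribing
# a 16 kHz mono waveform. The repo id is an assumption, not confirmed by the card.
import torch
from transformers import WhisperProcessor, WhisperForConditionalGeneration

processor = WhisperProcessor.from_pretrained("samolego/whisper-small-sl")
model = WhisperForConditionalGeneration.from_pretrained("samolego/whisper-small-sl")

waveform = torch.zeros(16000)  # placeholder: one second of silence at 16 kHz
inputs = processor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")

predicted_ids = model.generate(inputs.input_features)
text = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
print(text)
```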
Intended uses & limitations
More information needed
Training and evaluation data
Verdonik, Darinka; et al., 2023, ASR database ARTUR 1.0 (audio), Slovenian language resource repository CLARIN.SI, ISSN 2820-4042, http://hdl.handle.net/11356/1776.
Training procedure
Training hyperparameters
The following hyperparameters were used during training (see the code sketch after this list):
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0
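As a reproduction aid, here is a hedged sketch of how these hyperparameters would map onto `Seq2SeqTrainingArguments` in Transformers; `output_dir` and the 500-step evaluation cadence (inferred from the results table below) are assumptions.

```python
# Sketch only: the hyperparameters above expressed as Transformers
# Seq2SeqTrainingArguments. output_dir and eval_steps are assumptions;
# eval_steps=500 is inferred from the 500-step rows in the results table.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-small-sl",   # assumed output location
    learning_rate=5e-05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=3.0,
    evaluation_strategy="steps",       # assumed: matches the per-step results
    eval_steps=500,
)
```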
Training results
| Training Loss | Epoch | Step | Validation Loss | WER |
|:-------------:|:-----:|:----:|:---------------:|:-------:|
| 0.2778 | 0.07 | 500 | 0.2748 | 23.0421 |
| 0.2009 | 0.14 | 1000 | 0.1972 | 17.3073 |
| 0.1643 | 0.21 | 1500 | 0.1658 | 14.5195 |
| 0.1569 | 0.28 | 2000 | 0.1495 | 13.1550 |
| 0.1344 | 0.36 | 2500 | 0.1380 | 12.2945 |
| 0.1295 | 0.43 | 3000 | 0.1302 | 11.6237 |
| 0.1239 | 0.5 | 3500 | 0.1249 | 11.2128 |
| 0.1178 | 0.57 | 4000 | 0.1226 | 11.0097 |
Framework versions
- Transformers 4.39.0.dev0
- PyTorch 2.0.1+cu117
- Datasets 2.18.0
- Tokenizers 0.15.2
🔧 Technical Details
The model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the ASR database ARTUR 1.0 (audio). It was trained with a well-defined set of hyperparameters, including a learning rate of 5e-05 and the Adam optimizer, over 3 epochs, and evaluated on loss and WER.
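For readers unfamiliar with the metric, WER (word error rate) counts substitutions, insertions, and deletions against the reference transcript. A quick illustration with the `evaluate` library (the example strings below are made up):

```python
# Sketch: computing WER with the `evaluate` library (pip install evaluate jiwer).
# The prediction/reference strings below are illustrative only.
import evaluate

wer_metric = evaluate.load("wer")
predictions = ["danes je lep dan"]
references = ["danes je lep sončen dan"]

# WER = (substitutions + insertions + deletions) / reference word count
wer = wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.2f}")  # 0.20 here: one deleted word over five reference words
```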
📄 License
This model is released under the Apache 2.0 license.