🚀 speecht5_finetuned_facebook_voxpopuli_french
This model is a fine-tuned version of microsoft/speecht5_tts on the voxpopuli dataset. It converts text to speech and reaches a loss of 0.4379 on the evaluation set.
🚀 Quick Start
This model is a fine-tuned version of microsoft/speecht5_tts on the voxpopuli dataset.
It achieves the following results on the evaluation set:
- Loss: 0.4379
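The card does not include usage code, so the following is only a minimal inference sketch, assuming the standard SpeechT5 classes in Transformers. The repository id in `MODEL_ID` is a hypothetical placeholder (the card does not state the Hub path), the `synthesize` helper and `out_path` default are illustrative names, and it is an assumption that the fine-tuned repository ships processor files; if it does not, loading the processor from microsoft/speecht5_tts may work instead. The vocoder is assumed to be microsoft/speecht5_hifigan, which is the one published for the base model.

```python
"""Minimal inference sketch for a SpeechT5 checkpoint (assumptions flagged inline)."""

# ASSUMPTION: placeholder repo id; replace with the real Hub path of this model.
MODEL_ID = "your-username/speecht5_finetuned_facebook_voxpopuli_french"
# ASSUMPTION: the HiFi-GAN vocoder published for the base model is used.
VOCODER_ID = "microsoft/speecht5_hifigan"
# SpeechT5 produces 16 kHz audio.
SAMPLE_RATE = 16_000


def synthesize(text, speaker_embeddings, model_id=MODEL_ID, out_path="speech.wav"):
    """Generate speech for `text` and write it to `out_path` as a WAV file.

    `speaker_embeddings` is expected to be a float tensor of shape (1, 512),
    i.e. one x-vector describing the target voice.
    """
    # Imported lazily so this module can be imported without the heavy dependencies.
    import soundfile as sf
    from transformers import (
        SpeechT5ForTextToSpeech,
        SpeechT5HifiGan,
        SpeechT5Processor,
    )

    processor = SpeechT5Processor.from_pretrained(model_id)
    model = SpeechT5ForTextToSpeech.from_pretrained(model_id)
    vocoder = SpeechT5HifiGan.from_pretrained(VOCODER_ID)

    # Tokenize the text and run the model; the vocoder turns the predicted
    # spectrogram into a waveform.
    inputs = processor(text=text, return_tensors="pt")
    waveform = model.generate_speech(
        inputs["input_ids"], speaker_embeddings, vocoder=vocoder
    )
    sf.write(out_path, waveform.detach().cpu().numpy(), samplerate=SAMPLE_RATE)
    return waveform
```

A call would look like `synthesize("Bonjour tout le monde.", embedding)`, where `embedding` is a speaker x-vector you supply. The card does not say how speaker embeddings were computed during fine-tuning, so the choice of embedding is left to the caller.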
📚 Documentation
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 30
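To make the relationship between these values concrete, here is a small pure-Python sketch. It shows how the total train batch size follows from the per-device batch size and gradient accumulation, and what a linear schedule with 100 warmup steps looks like over the run. The function names are illustrative, the single-device setting is inferred from 16 × 2 = 32, and the schedule formula is the usual "linear warmup, then linear decay to zero" shape, which is assumed to match what the trainer used. The 1584 steps per epoch come from the training results reported in this card.

```python
# Values taken from the hyperparameter list and the training results.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 2
WARMUP_STEPS = 100
NUM_EPOCHS = 30
STEPS_PER_EPOCH = 1584  # optimizer steps per epoch, as reported in the results
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 47520


def effective_batch_size(per_device, grad_accum, num_devices=1):
    """Samples contributing to one optimizer step.

    ASSUMPTION: a single device, since 16 * 2 already equals the reported 32.
    """
    return per_device * grad_accum * num_devices


def linear_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if step < warmup:
        # Ramp up from 0 to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup)
    # Decay linearly from base_lr at the end of warmup to 0 at the last step.
    remaining = max(0, total - step)
    return base_lr * remaining / max(1, total - warmup)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (0, 50, 100, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(f"step {step:>6}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the learning rate peaks at 5e-05 after step 100 and then falls steadily, so the later epochs train with a much smaller step size than the early ones.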
Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 0.4872 | 1.0 | 1584 | 0.4663 |
| 0.4656 | 2.0 | 3168 | 0.4642 |
| 0.4686 | 3.0 | 4752 | 0.4533 |
| 0.4576 | 4.0 | 6336 | 0.4479 |
| 0.4658 | 5.0 | 7920 | 0.4485 |
| 0.4536 | 6.0 | 9504 | 0.4443 |
| 0.4559 | 7.0 | 11088 | 0.4426 |
| 0.449 | 8.0 | 12672 | 0.4410 |
| 0.4469 | 9.0 | 14256 | 0.4420 |
| 0.4565 | 10.0 | 15840 | 0.4402 |
| 0.4428 | 11.0 | 17424 | 0.4470 |
| 0.4412 | 12.0 | 19008 | 0.4400 |
| 0.4437 | 13.0 | 20592 | 0.4396 |
| 0.4395 | 14.0 | 22176 | 0.4385 |
| 0.4461 | 15.0 | 23760 | 0.4407 |
| 0.4401 | 16.0 | 25344 | 0.4387 |
| 0.4407 | 17.0 | 26928 | 0.4379 |
| 0.4359 | 18.0 | 28512 | 0.4384 |
| 0.4338 | 19.0 | 30096 | 0.4387 |
| 0.4326 | 20.0 | 31680 | 0.4381 |
| 0.4406 | 21.0 | 33264 | 0.4390 |
| 0.437 | 22.0 | 34848 | 0.4387 |
| 0.4357 | 23.0 | 36432 | 0.4389 |
| 0.4309 | 24.0 | 38016 | 0.4387 |
| 0.441 | 25.0 | 39600 | 0.4379 |
| 0.4355 | 26.0 | 41184 | 0.4378 |
| 0.4312 | 27.0 | 42768 | 0.4380 |
| 0.4328 | 28.0 | 44352 | 0.4388 |
| 0.4289 | 29.0 | 45936 | 0.4380 |
| 0.4291 | 30.0 | 47520 | 0.4379 |
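As a reading aid for these results, the sketch below copies the per-epoch validation losses from the table and looks for the best epoch and the spread over the final ten epochs. The helper names are illustrative and not part of any training script; the numbers are transcribed directly from the reported values.

```python
# Validation loss per epoch (epochs 1..30), transcribed from the results above.
VAL_LOSS = [
    0.4663, 0.4642, 0.4533, 0.4479, 0.4485, 0.4443, 0.4426, 0.4410, 0.4420, 0.4402,
    0.4470, 0.4400, 0.4396, 0.4385, 0.4407, 0.4387, 0.4379, 0.4384, 0.4387, 0.4381,
    0.4390, 0.4387, 0.4389, 0.4387, 0.4379, 0.4378, 0.4380, 0.4388, 0.4380, 0.4379,
]


def best_epoch(losses):
    """Return (1-based epoch, loss) for the lowest validation loss."""
    idx = min(range(len(losses)), key=losses.__getitem__)
    return idx + 1, losses[idx]


def spread(losses):
    """Difference between the largest and smallest loss in `losses`."""
    return max(losses) - min(losses)


if __name__ == "__main__":
    epoch, loss = best_epoch(VAL_LOSS)
    print(f"best validation loss {loss} at epoch {epoch}")
    print(f"spread over the last 10 epochs: {spread(VAL_LOSS[-10:]):.4f}")
```

The lowest validation loss (0.4378) occurs at epoch 26, and the last ten epochs vary by only about 0.0012, which suggests the validation loss had largely plateaued well before training ended.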
Framework versions
- Transformers 4.30.0.dev0
- Pytorch 2.0.1+cu117
- Datasets 2.13.1
- Tokenizers 0.13.3
📄 License
This model is licensed under the MIT license.
| Property | Details |
|---|---|
| Model Type | Fine-tuned version of microsoft/speecht5_tts on the voxpopuli dataset |
| Training Data | voxpopuli |
| Pipeline Tag | text-to-speech |
| Base Model | microsoft/speecht5_tts |