speecht5_finetuned_fleurs_zh_4000: an open-source, freely deployable model for Chinese speech synthesis

Speecht5 Finetuned Fleurs Zh 4000

Developed by GCYY

This is a text-to-speech (TTS) model based on microsoft/speecht5_tts and fine-tuned on the fleurs dataset to support Chinese speech generation.

Speech Synthesis

Transformers

Open Source License: MIT

#Chinese Speech Synthesis #Low-resource Fine-tuning #TTS Optimization

Downloads 15

Release Time: 9/6/2023

Model Overview

This is a speech synthesis model optimized for Chinese, with improved performance in Chinese speech generation tasks through fine-tuning.

Model Features

Chinese Optimization

Fine-tuned on the fleurs Chinese dataset, optimized for Chinese speech generation

Efficient Training

Fine-tuned for only 4000 training steps (about 44 epochs)

Low Loss

Reaches a validation loss of 0.3888 on the evaluation set at the final step

Model Capabilities

Chinese Text-to-Speech Conversion

High-quality Speech Synthesis

Use Cases

Speech Synthesis Applications

Voice Assistants

Provides natural Chinese voice output for smart devices

Audiobooks

Converts Chinese text into natural speech

🚀 speecht5_finetuned_fleurs_zh_4000

This model is a fine-tuned version of microsoft/speecht5_tts on the fleurs dataset, intended for Chinese text-to-speech.

🚀 Quick Start

This model is a fine-tuned version of microsoft/speecht5_tts on the fleurs dataset. It achieves the following results on the evaluation set:

Loss: 0.3888
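
The card does not include inference code, so the following is only a minimal sketch of how a SpeechT5 checkpoint is typically run with Transformers. Several things here are assumptions rather than facts from the card: the Hub repository ID `GCYY/speecht5_finetuned_fleurs_zh_4000` (author plus model name), that the repository ships processor files, the use of the `microsoft/speecht5_hifigan` vocoder, the `soundfile` package for writing audio, and a zero-valued 512-dimensional speaker embedding used purely as a placeholder.

```python
SAMPLE_RATE = 16000  # SpeechT5 and its HiFi-GAN vocoder work with 16 kHz audio

# Assumed Hub repository ID (author + model name); not confirmed by the card.
MODEL_ID = "GCYY/speecht5_finetuned_fleurs_zh_4000"


def synthesize(text, model_id=MODEL_ID, out_path="output.wav"):
    """Generate speech for `text` and write it to `out_path` as a WAV file."""
    # Heavy dependencies are imported lazily so defining this function is cheap.
    import soundfile as sf
    import torch
    from transformers import (
        SpeechT5ForTextToSpeech,
        SpeechT5HifiGan,
        SpeechT5Processor,
    )

    # Assumption: the fine-tuned repository contains the processor files.
    processor = SpeechT5Processor.from_pretrained(model_id)
    model = SpeechT5ForTextToSpeech.from_pretrained(model_id)
    # Assumption: the stock SpeechT5 vocoder is used to turn spectrograms into audio.
    vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

    inputs = processor(text=text, return_tensors="pt")

    # Placeholder speaker embedding (zero x-vector, shape (1, 512)).
    # A real speaker embedding should give noticeably better audio.
    speaker_embeddings = torch.zeros((1, 512))

    with torch.no_grad():
        speech = model.generate_speech(
            inputs["input_ids"], speaker_embeddings, vocoder=vocoder
        )

    sf.write(out_path, speech.numpy(), samplerate=SAMPLE_RATE)
    return out_path
```

Calling `synthesize(text)` downloads the weights on first use. The card does not say whether the model expects Chinese characters or a romanized form as input, so it is worth inspecting the repository's tokenizer before relying on the output.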

📚 Documentation

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 4
eval_batch_size: 2
seed: 42
gradient_accumulation_steps: 8
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
training_steps: 4000
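
To make the relationship between these values concrete, here is a small sketch that writes them as keyword arguments and reproduces two derived quantities: the total batch size and the learning rate under a linear schedule with warmup. The key names follow the arguments of Transformers' `Seq2SeqTrainingArguments`, but the card does not confirm which trainer class was used, and the single-device setting is an assumption (it is consistent with 4 × 8 = 32).

```python
# Values copied from the hyperparameter list above. Key names follow
# transformers' Seq2SeqTrainingArguments; the trainer actually used is an assumption.
training_kwargs = {
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 2,
    "seed": 42,
    "gradient_accumulation_steps": 8,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 500,
    "max_steps": 4000,
}

NUM_DEVICES = 1  # assumption: one device, consistent with 4 * 8 = 32


def effective_batch_size(kwargs, num_devices=NUM_DEVICES):
    """Examples consumed per optimizer step."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def linear_lr(step, kwargs=training_kwargs):
    """Learning rate at `step`: linear warmup from 0, then linear decay to 0."""
    base = kwargs["learning_rate"]
    warmup = kwargs["warmup_steps"]
    total = kwargs["max_steps"]
    if step < warmup:
        return base * (step / warmup)
    return base * max(0.0, (total - step) / (total - warmup))


print("total_train_batch_size:", effective_batch_size(training_kwargs))
for step in (0, 250, 500, 2250, 4000):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at 1e-05 at step 500 and falls back to zero at step 4000.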

Training results

Training Loss	Epoch	Step	Validation Loss
0.7366	1.09	100	0.6059
0.5892	2.19	200	0.5104
0.5436	3.28	300	0.4585
0.4848	4.38	400	0.4333
0.4733	5.47	500	0.4276
0.4534	6.57	600	0.4194
0.454	7.66	700	0.4172
0.4489	8.76	800	0.4111
0.4401	9.85	900	0.4108
0.441	10.94	1000	0.4136
0.437	12.04	1100	0.4078
0.4333	13.13	1200	0.4067
0.4328	14.23	1300	0.4002
0.4289	15.32	1400	0.4015
0.4254	16.42	1500	0.4012
0.427	17.51	1600	0.4020
0.4273	18.6	1700	0.4008
0.4222	19.7	1800	0.3966
0.4305	20.79	1900	0.3998
0.4198	21.89	2000	0.3954
0.4225	22.98	2100	0.3961
0.4223	24.08	2200	0.3965
0.4201	25.17	2300	0.3922
0.4234	26.27	2400	0.3939
0.4213	27.36	2500	0.3930
0.4182	28.45	2600	0.3934
0.4119	29.55	2700	0.3925
0.4113	30.64	2800	0.3907
0.4131	31.74	2900	0.3907
0.4135	32.83	3000	0.3933
0.4142	33.93	3100	0.3909
0.4144	35.02	3200	0.3919
0.414	36.11	3300	0.3919
0.418	37.21	3400	0.3899
0.4094	38.3	3500	0.3897
0.4149	39.4	3600	0.3924
0.4105	40.49	3700	0.3905
0.413	41.59	3800	0.3895
0.4117	42.68	3900	0.3900
0.4096	43.78	4000	0.3888
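
The table also allows a rough back-of-the-envelope estimate of the training set size, which the card does not state. The sketch below uses a handful of rows copied from the table and the total batch size of 32. The result is only approximate, because the epoch column is rounded and the last batch of each epoch may be partial.

```python
# (epoch, step, validation_loss) for a few rows copied from the table above.
rows = [
    (1.09, 100, 0.6059),
    (10.94, 1000, 0.4136),
    (21.89, 2000, 0.3954),
    (32.83, 3000, 0.3933),
    (43.78, 4000, 0.3888),
]
TOTAL_TRAIN_BATCH_SIZE = 32  # from the hyperparameter list

final_epoch, final_step, final_loss = rows[-1]

# Optimizer steps needed to pass over the training data once.
steps_per_epoch = final_step / final_epoch

# Rough number of training examples seen per epoch.
approx_examples = steps_per_epoch * TOTAL_TRAIN_BATCH_SIZE

# Row with the lowest validation loss among the sampled rows.
best_row = min(rows, key=lambda row: row[2])

print(f"steps per epoch: about {steps_per_epoch:.1f}")
print(f"training examples: about {approx_examples:.0f}")
print(f"lowest sampled validation loss: {best_row[2]} at step {best_row[1]}")
```

This suggests roughly 91 steps per epoch, or on the order of 2,900 training examples. Validation loss is still edging downward at step 4000, although the gain over the last 1000 steps is small.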

Framework versions

Transformers 4.33.0
Pytorch 2.0.1+cu118
Datasets 2.14.4
Tokenizers 0.13.3
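
To check whether a local environment matches these versions, a small sketch using the standard library's `importlib.metadata` may help. The distribution names below (`torch` for Pytorch, and so on) are the usual PyPI names and are an assumption here. Local build suffixes such as `+cu118` are ignored in the comparison.

```python
from importlib import metadata

# Versions listed on the card, keyed by the usual PyPI distribution names (assumed).
EXPECTED = {
    "transformers": "4.33.0",
    "torch": "2.0.1",
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}


def compare_versions(expected=EXPECTED):
    """Return {name: (wanted, installed_or_None)} for each expected package."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed)
    return report


for name, (wanted, installed) in compare_versions().items():
    # Strip a local build tag such as "+cu118" before comparing.
    matches = installed is not None and installed.split("+")[0] == wanted
    status = "ok" if matches else "differs"
    print(f"{name}: wanted {wanted}, installed {installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; it only flags differences from the environment the card reports.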

📄 License

This model is released under the MIT license.
