The open-source text-to-speech model `speecht5_finetuned_voxpopuli_pl`

Speecht5 Finetuned Voxpopuli Pl

Developed by weiren119

A text-to-speech model based on microsoft/speecht5_tts and fine-tuned on the VoxPopuli dataset

Speech Synthesis

Transformers

Open Source License: MIT #Speech synthesis #Multilingual support #Low-resource optimization

Downloads 38

Release Time: 7/29/2023

Model Overview

This model is a text-to-speech (TTS) implementation of the SpeechT5 architecture, specifically fine-tuned on the VoxPopuli dataset, capable of converting text into natural speech.

Model Features

High-quality speech synthesis

Based on the SpeechT5 architecture, which is designed to generate natural, fluent speech. The model card reports only a validation loss, not a listening-quality metric.

Domain-specific fine-tuning

Fine-tuned on the VoxPopuli dataset, so it may be better suited to generating speech with the characteristics of that dataset

Efficient training

Trained for 2000 steps with an effective batch size of 32 (batch size 4 with 8 gradient accumulation steps)

Model Capabilities

Text-to-speech

Speech synthesis

Use Cases

Voice applications

Voice assistant

Provide natural speech output for virtual assistants

Audiobook generation

Convert text content into speech format

🚀 speecht5_finetuned_voxpopuli_pl

This is a fine-tuned text-to-speech model based on microsoft/speecht5_tts, trained on the VoxPopuli dataset.

🚀 Quick Start

On the evaluation set, the fine-tuned model achieves the following result:

Loss: 0.4550

📚 Documentation

✨ Features

Fine-tuning microsoft/speecht5_tts on the VoxPopuli dataset adapts the model to speech from that dataset. The only reported metric is the evaluation loss of 0.4550.

📦 Installation

The original model card does not list installation steps. The library versions used for training are listed under Framework Versions.

💻 Usage Examples

The original model card does not include code examples.
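For orientation, here is a minimal inference sketch that is not taken from the model card. It uses the standard SpeechT5 classes from the Transformers library and rests on several assumptions: that the checkpoint is published on the Hugging Face Hub as `weiren119/speecht5_finetuned_voxpopuli_pl`, that the repository ships processor files, and that the stock `microsoft/speecht5_hifigan` vocoder is appropriate. The model card does not say which speaker embeddings to use, so the caller must supply one.

```python
"""Sketch: synthesize speech with a fine-tuned SpeechT5 checkpoint."""

# Assumption: the checkpoint lives on the Hugging Face Hub under this ID.
MODEL_ID = "weiren119/speecht5_finetuned_voxpopuli_pl"
# Assumption: the stock HiFi-GAN vocoder released with SpeechT5 is used.
VOCODER_ID = "microsoft/speecht5_hifigan"
# SpeechT5 with this vocoder produces 16 kHz audio.
SAMPLE_RATE = 16_000


def synthesize(text, speaker_embedding, model_id=MODEL_ID, vocoder_id=VOCODER_ID):
    """Return a 1-D waveform tensor for `text`.

    `speaker_embedding` is a float tensor of shape (1, 512), i.e. an x-vector
    describing the target voice. The model card does not name a source for it.
    """
    # Imported lazily so this module can be imported without the heavy
    # dependencies installed.
    from transformers import (
        SpeechT5ForTextToSpeech,
        SpeechT5HifiGan,
        SpeechT5Processor,
    )

    # Assumption: the fine-tuned repository includes processor files. If it
    # does not, loading the processor from "microsoft/speecht5_tts" may work.
    processor = SpeechT5Processor.from_pretrained(model_id)
    model = SpeechT5ForTextToSpeech.from_pretrained(model_id)
    vocoder = SpeechT5HifiGan.from_pretrained(vocoder_id)

    # Tokenize the text, then generate a spectrogram and vocode it to audio.
    inputs = processor(text=text, return_tensors="pt")
    return model.generate_speech(
        inputs["input_ids"], speaker_embedding, vocoder=vocoder
    )
```

Calling `synthesize(text, embedding)` returns a waveform that can be written to disk with, for example, `soundfile.write("output.wav", waveform.numpy(), samplerate=SAMPLE_RATE)`. Output quality depends heavily on the speaker embedding, so an embedding from a speaker similar to the training data is likely to work best.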

🔧 Technical Details

Training and Evaluation Data

The model was trained on the VoxPopuli dataset. The original model card gives no further details about the training and evaluation data.

Training Procedure

Training Hyperparameters

The following hyperparameters were used during training:

Property	Details
learning_rate	1e-05
train_batch_size	4
eval_batch_size	2
seed	42
gradient_accumulation_steps	8
total_train_batch_size	32
optimizer	Adam with betas=(0.9, 0.999) and epsilon=1e-08
lr_scheduler_type	linear
lr_scheduler_warmup_steps	500
training_steps	2000
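Two of these values follow from the others, and a short sketch makes the arithmetic explicit. It shows how the total batch size of 32 arises from gradient accumulation and how a linear schedule with warmup behaves over the 2000 steps. The function names are illustrative, and the schedule formula is the usual linear warmup and linear decay to zero, which is assumed to match what was used here.

```python
# Values copied from the hyperparameter table.
LEARNING_RATE = 1e-5
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 8
WARMUP_STEPS = 500
TRAINING_STEPS = 2000


def effective_batch_size(batch_size, accumulation_steps, num_devices=1):
    """Number of examples contributing to each optimizer update."""
    return batch_size * accumulation_steps * num_devices


def linear_lr(step, base_lr=LEARNING_RATE, warmup_steps=WARMUP_STEPS,
              total_steps=TRAINING_STEPS):
    """Learning rate at `step`: linear warmup, then linear decay to zero.

    Assumption: this is the standard linear schedule with warmup; the model
    card only names the scheduler type and the warmup length.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print("effective batch size:",
          effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (0, 250, 500, 1250, 2000):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at 1e-05 at step 500 and falls back to zero at step 2000, so the final checkpoints are trained with very small updates.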

Training Results

Training Loss	Epoch	Step	Validation Loss
0.6954	0.5	100	0.6110
0.644	1.01	200	0.5731
0.602	1.51	300	0.5330
0.5524	2.01	400	0.4982
0.5412	2.51	500	0.4870
0.5256	3.02	600	0.4775
0.5141	3.52	700	0.4728
0.5125	4.02	800	0.4688
0.5106	4.52	900	0.4657
0.5037	5.03	1000	0.4627
0.5048	5.53	1100	0.4622
0.4983	6.03	1200	0.4583
0.4981	6.53	1300	0.4580
0.4942	7.04	1400	0.4580
0.4945	7.54	1500	0.4578
0.4922	8.04	1600	0.4568
0.4893	8.54	1700	0.4562
0.4948	9.05	1800	0.4552
0.4892	9.55	1900	0.4547
0.4933	10.05	2000	0.4550
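The table can be read programmatically to answer two simple questions: which logged step has the lowest validation loss, and roughly how many optimizer steps make up one epoch. The sketch below copies the table rows verbatim. The helper names are illustrative, and the steps-per-epoch figure is only an estimate from the rounded epoch column.

```python
# Rows copied from the table: (training_loss, epoch, step, validation_loss)
RESULTS = [
    (0.6954, 0.5, 100, 0.6110),
    (0.644, 1.01, 200, 0.5731),
    (0.602, 1.51, 300, 0.5330),
    (0.5524, 2.01, 400, 0.4982),
    (0.5412, 2.51, 500, 0.4870),
    (0.5256, 3.02, 600, 0.4775),
    (0.5141, 3.52, 700, 0.4728),
    (0.5125, 4.02, 800, 0.4688),
    (0.5106, 4.52, 900, 0.4657),
    (0.5037, 5.03, 1000, 0.4627),
    (0.5048, 5.53, 1100, 0.4622),
    (0.4983, 6.03, 1200, 0.4583),
    (0.4981, 6.53, 1300, 0.4580),
    (0.4942, 7.04, 1400, 0.4580),
    (0.4945, 7.54, 1500, 0.4578),
    (0.4922, 8.04, 1600, 0.4568),
    (0.4893, 8.54, 1700, 0.4562),
    (0.4948, 9.05, 1800, 0.4552),
    (0.4892, 9.55, 1900, 0.4547),
    (0.4933, 10.05, 2000, 0.4550),
]


def best_row(results):
    """Return the logged row with the lowest validation loss."""
    return min(results, key=lambda row: row[3])


def estimated_steps_per_epoch(results):
    """Estimate optimizer steps per epoch from the last logged row."""
    _, epoch, step, _ = results[-1]
    return step / epoch


if __name__ == "__main__":
    print("best row:", best_row(RESULTS))
    print("steps per epoch (approx.):", round(estimated_steps_per_epoch(RESULTS)))
```

The lowest logged validation loss (0.4547 at step 1900) is marginally better than the final value (0.4550 at step 2000), and the curve is nearly flat after step 1200, which suggests that additional steps with this configuration would yield little further improvement.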

Framework Versions

Transformers 4.31.0
PyTorch 2.0.1+cu117
Datasets 2.14.0
Tokenizers 0.13.3
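To compare a local environment against these versions, a small check along the following lines may help. It is a sketch using only the standard library. The PyPI package names (`torch` for PyTorch in particular) and the choice to ignore local build tags such as `+cu117` are assumptions, and matching versions is not stated to be a requirement for running the model.

```python
from importlib import metadata

# Versions from the list above, keyed by assumed PyPI package name.
EXPECTED = {
    "transformers": "4.31.0",
    "torch": "2.0.1+cu117",
    "datasets": "2.14.0",
    "tokenizers": "0.13.3",
}


def installed_version(package):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def base_version(version):
    """Drop a local build tag such as '+cu117' before comparing."""
    return version.split("+", 1)[0]


def version_mismatches(expected=EXPECTED, lookup=installed_version):
    """Map package name to (expected, found) for every differing package."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found is None or base_version(found) != base_version(wanted):
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in version_mismatches().items():
        print(f"{name}: expected {wanted}, found {found}")
```

An empty result means the four packages match the listed versions up to the build tag. Newer releases may still work, but that is untested here.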

📄 License

This model is released under the MIT license.
