SpeechT5_TTS_Haitian: Open-Source Haitian Creole Text-to-Speech Model

SpeechT5 TTS Haitian

Developed by idajikuu

A Haitian Creole text-to-speech model fine-tuned from SpeechT5 on Carnegie Mellon University's Haitian language dataset.

Speech Synthesis

Transformers

Other

#Haitian Creole TTS #Low-resource language synthesis #Multilingual speech synthesis

Downloads 139

Release Time: 7/23/2023

Model Overview

This model converts Haitian Creole text into speech and is suited to applications such as audiobook narration and voice assistants.

Model Features

Haitian Creole support

A speech synthesis model fine-tuned specifically for Haitian Creole.

Based on SpeechT5 architecture

Builds on the SpeechT5 text-to-speech architecture.

Trained on the CMU Haitian dataset

Fine-tuned on the Haitian language dataset provided by Carnegie Mellon University.

Model Capabilities

Haitian Creole text-to-speech

Speech synthesis

Use Cases

Education

Audiobook narration

Converts Haitian Creole books into speech.

Aims to provide natural, fluent narration.

Smart assistants

Voice assistant

Provides voice interaction functionality for Haitian Creole users

🚀 Fine-tuned SpeechT5 TTS Model for Haitian Creole

This model is a fine-tuned version of [microsoft/speecht5_tts](https://huggingface.co/microsoft/speecht5_tts) for the Haitian Creole language, enabling text-to-speech conversion in this language.

🚀 Quick Start

This fine-tuned SpeechT5 TTS model is ready to use for text-to-speech applications in Haitian Creole and can be integrated into existing projects.

✨ Features

Language-Specific: Fine-tuned specifically for Haitian Creole, so that it synthesizes speech from Haitian Creole text.
Based on SpeechT5: Uses the SpeechT5 architecture, an encoder-decoder Transformer framework for speech and text that was inspired by T5.

💻 Usage Examples

The original model card does not include usage code.
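As a starting point, here is a minimal inference sketch. It is not taken from the model card and rests on several assumptions: that the checkpoint is published on the Hugging Face Hub as `idajikuu/SpeechT5_TTS_Haitian` (inferred from the model and developer names, not confirmed), that the repository also contains the processor files, and that the stock `microsoft/speecht5_hifigan` vocoder is an acceptable choice. The card does not say which speaker embedding was used during fine-tuning, so the caller has to supply a `(1, 512)` x-vector tensor.

```python
import struct
import wave

MODEL_ID = "idajikuu/SpeechT5_TTS_Haitian"  # assumed Hub repo id, not confirmed by the card
VOCODER_ID = "microsoft/speecht5_hifigan"   # assumed vocoder choice
SAMPLE_RATE = 16_000                        # SpeechT5's HiFi-GAN vocoder outputs 16 kHz audio


def float_to_pcm16(samples):
    """Convert floats in [-1, 1] to little-endian 16-bit PCM bytes (values are clipped)."""
    clipped = [max(-1.0, min(1.0, float(s))) for s in samples]
    return struct.pack(f"<{len(clipped)}h", *(int(round(s * 32767)) for s in clipped))


def write_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write mono 16-bit PCM audio using only the standard library."""
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(sample_rate)
        f.writeframes(float_to_pcm16(samples))


def synthesize(text, speaker_embedding, out_path="speech.wav"):
    """Synthesize `text` and save it as a WAV file.

    `speaker_embedding` is a torch tensor of shape (1, 512); which x-vector
    suits this checkpoint is not documented, so that choice is left to the caller.
    """
    # Heavy imports are local so the helpers above work without them installed.
    import torch
    from transformers import (
        SpeechT5ForTextToSpeech,
        SpeechT5HifiGan,
        SpeechT5Processor,
    )

    processor = SpeechT5Processor.from_pretrained(MODEL_ID)
    model = SpeechT5ForTextToSpeech.from_pretrained(MODEL_ID)
    vocoder = SpeechT5HifiGan.from_pretrained(VOCODER_ID)

    inputs = processor(text=text, return_tensors="pt")
    with torch.no_grad():
        speech = model.generate_speech(
            inputs["input_ids"], speaker_embedding, vocoder=vocoder
        )
    write_wav(out_path, speech.tolist())
    return out_path
```

Calling `synthesize("Bonjou, kijan ou ye?", xvector)` would download the weights on first use and write `speech.wav`. Running it requires `torch`, `transformers`, and `sentencepiece`; if the repository turns out not to include processor files, loading the processor from `microsoft/speecht5_tts` is a plausible fallback.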

📚 Documentation

Model Description

The model is based on the SpeechT5 architecture, an encoder-decoder Transformer framework for speech and text that was inspired by T5 (Text-to-Text Transfer Transformer). This checkpoint uses its text-to-speech configuration and converts input text in Haitian Creole into corresponding speech.

Intended Uses & Limitations

The model is intended for text-to-speech (TTS) applications in Haitian Creole. It generates speech from written text for applications such as audiobook narration and voice assistants.

However, there are some limitations to be aware of:

The model's performance depends heavily on the quality and diversity of the training data. Fine-tuning on more diverse or domain-specific datasets might improve it.
Like all machine learning models, it may produce inaccurate or erroneous speech, especially for complex sentences or domain-specific jargon.
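One common way to reduce errors on long or complex input is to synthesize one sentence at a time. The sketch below is an illustration only, not part of the model card: `split_into_sentences` is a hypothetical helper that splits on sentence-final punctuation with a simple regular expression, which will mis-split abbreviations and numbers.

```python
import re


def split_into_sentences(text):
    """Split on whitespace that follows ., !, ? or …, keeping the punctuation."""
    parts = re.split(r"(?<=[.!?…])\s+", text.strip())
    return [p for p in parts if p]


sentences = split_into_sentences("Bonjou. Kijan ou ye? Mwen byen!")
print(sentences)
```

Each resulting sentence can then be passed to the model separately and the audio segments concatenated.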

Training and Evaluation Data

The model was fine-tuned on the CMU Haitian dataset, which contains text and corresponding audio samples in Haitian Creole. The dataset was split into training and evaluation sets to assess the model's performance.
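The card does not state how the split was made. As a sketch of one reproducible approach, the hypothetical helper below shuffles with a fixed seed and holds out a fraction for evaluation; the 10% fraction and the seed are assumptions chosen for illustration, not values from the card.

```python
import random


def train_eval_split(items, eval_fraction=0.1, seed=0):
    """Shuffle deterministically, then hold out `eval_fraction` of the items."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_eval = max(1, round(len(items) * eval_fraction))
    return items[n_eval:], items[:n_eval]


train_items, eval_items = train_eval_split(range(100))
print(len(train_items), len(eval_items))
```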

Training Procedure

Training Hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
per_device_train_batch_size: 16
gradient_accumulation_steps: 2
warmup_steps: 500
max_steps: 4000
gradient_checkpointing: True
fp16: True
evaluation_strategy: no
per_device_eval_batch_size: 8
save_steps: 1000
logging_steps: 25
report_to: ["tensorboard"]
greater_is_better: False
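These names match keyword arguments of the Transformers training-arguments classes. The sketch below collects them in a dictionary and derives the effective batch size; `build_training_args` assumes `Seq2SeqTrainingArguments` was the class used, which the card does not confirm, and `output_dir` is a placeholder.

```python
HYPERPARAMS = {
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 16,
    "gradient_accumulation_steps": 2,
    "warmup_steps": 500,
    "max_steps": 4000,
    "gradient_checkpointing": True,
    "fp16": True,  # needs a CUDA device at construction time
    "evaluation_strategy": "no",
    "per_device_eval_batch_size": 8,
    "save_steps": 1000,
    "logging_steps": 25,
    "report_to": ["tensorboard"],
    "greater_is_better": False,
}


def effective_batch_size(params, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return (
        params["per_device_train_batch_size"]
        * params["gradient_accumulation_steps"]
        * num_devices
    )


def build_training_args(output_dir):
    """Build training arguments; written against Transformers 4.31.0.

    Newer releases may expect `eval_strategy` instead of `evaluation_strategy`.
    """
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **HYPERPARAMS)


print(effective_batch_size(HYPERPARAMS))  # 16 * 2 on one device
```

With a single device, each optimizer step sees 32 examples, so 4000 steps correspond to 128,000 examples processed.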

Training Results

The training progress and evaluation results are as follows:

Training Loss	Epoch	Step	Validation Loss
0.5147	2.42	1000	0.4753
0.4932	4.84	2000	0.4629
0.4926	7.26	3000	0.4566
0.4907	9.69	4000	0.4542
0.4839	12.11	5000	0.4532
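The epoch and step columns allow a rough estimate of the training-set size. The sketch below parses the rows exactly as listed (including the step-5000 row, although `max_steps` is 4000) and assumes a single device with an effective batch size of 16 × 2 = 32; the resulting example count is an estimate, not a figure from the card.

```python
TABLE = """\
Training Loss\tEpoch\tStep\tValidation Loss
0.5147\t2.42\t1000\t0.4753
0.4932\t4.84\t2000\t0.4629
0.4926\t7.26\t3000\t0.4566
0.4907\t9.69\t4000\t0.4542
0.4839\t12.11\t5000\t0.4532"""

rows = []
for line in TABLE.splitlines()[1:]:
    train_loss, epoch, step, val_loss = line.split("\t")
    rows.append(
        {
            "train_loss": float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
        }
    )

# Optimizer steps per pass over the training data, averaged over the rows.
steps_per_epoch = [r["step"] / r["epoch"] for r in rows]
mean_steps_per_epoch = sum(steps_per_epoch) / len(steps_per_epoch)

EFFECTIVE_BATCH = 16 * 2  # assumption: one device
approx_train_examples = mean_steps_per_epoch * EFFECTIVE_BATCH

# Change in validation loss between consecutive checkpoints.
val_deltas = [round(b["val_loss"] - a["val_loss"], 4) for a, b in zip(rows, rows[1:])]

print(round(mean_steps_per_epoch), round(approx_train_examples), val_deltas)
```

Under these assumptions an epoch is about 413 steps, or roughly 13,000 training examples, and the validation loss improves by a smaller amount at each checkpoint.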

Training Output

The training was completed with the following output:

Global Step: 4000
Training Loss: 0.3344
Training Runtime: 7123.63 seconds
Training Samples per Second: 17.97
Training Steps per Second: 0.562
Total FLOPs: 1.1690e+16
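The reported throughput can be cross-checked against the hyperparameters. The sketch below assumes a single device, so that each step processes 16 × 2 = 32 samples; that assumption is not stated in the card but is consistent with the numbers.

```python
GLOBAL_STEP = 4000
RUNTIME_S = 7123.63
EFFECTIVE_BATCH = 16 * 2  # assumption: one device, batch 16, accumulation 2

steps_per_second = GLOBAL_STEP / RUNTIME_S
samples_per_second = GLOBAL_STEP * EFFECTIVE_BATCH / RUNTIME_S
runtime_hours = RUNTIME_S / 3600

print(round(steps_per_second, 3), round(samples_per_second, 2), round(runtime_hours, 2))
```

Both derived rates round to the reported values (0.562 steps and 17.97 samples per second), and the run took a little under two hours.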

Framework Versions

Transformers 4.31.0
PyTorch 2.0.1+cu118
Datasets 2.13.1
Tokenizers 0.13.3

🔧 Technical Details

The model is a fine-tuned version of [microsoft/speecht5_tts](https://huggingface.co/microsoft/speecht5_tts) for the Haitian Creole language. It builds on the SpeechT5 architecture, which supports text-to-speech tasks, and fine-tuning on the CMU Haitian dataset allows it to generate speech in Haitian Creole.

📄 License

The original model card does not specify a license.
