🚀 F5-TTS Italian Finetune
This is an Italian finetune of F5-TTS that aims to provide high-quality text-to-speech for Italian.
🚀 Quick Start
This project is an Italian finetune of F5-TTS. Its limitations and characteristics are described below.
✨ Features
- Language Specific: This model is finetuned specifically for Italian and cannot speak English properly.
- Training Data: It was trained on 247+ hours of the "train" split of the facebook/multilingual_librispeech dataset, at 6717 steps per epoch.
- Model Status: Training caused a catastrophic failure in which the model forgot English, and its Italian pronunciation is not perfect. However, many checkpoints are available for further training, possibly with different datasets.
📦 Installation
This README does not include installation steps.
💻 Usage Examples
The run.py file is an example of how to extract the wav files and produce the metadata.csv to use for training.
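As a rough illustration of that kind of preparation step, here is a minimal sketch that writes audio samples to wav files and builds a metadata.csv next to them. It is not the content of run.py. The function names (`write_wav`, `export_dataset`), the `wavs/` subfolder, the `utt_000000.wav` naming, and the shape of each record are hypothetical, and the pipe-delimited `audio_file|text` layout is an assumption about what the training tooling expects, so check it against the F5-TTS version you use.

```python
import csv
import struct
import wave
from pathlib import Path


def write_wav(path, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM wav file."""
    frames = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in samples
    )
    with wave.open(str(path), "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(frames)


def export_dataset(records, out_dir):
    """Export (samples, sample_rate, transcript) records to wavs + metadata.csv.

    The record shape and the output layout are assumptions made for this
    sketch, not something taken from run.py.
    """
    out_dir = Path(out_dir)
    wav_dir = out_dir / "wavs"
    wav_dir.mkdir(parents=True, exist_ok=True)

    rows = []
    for index, (samples, sample_rate, transcript) in enumerate(records):
        name = f"utt_{index:06d}.wav"
        write_wav(wav_dir / name, samples, sample_rate)
        rows.append((f"wavs/{name}", transcript.strip()))

    # Assumed layout: a header line followed by "relative/path.wav|transcript".
    with open(out_dir / "metadata.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="|")
        writer.writerow(("audio_file", "text"))
        writer.writerows(rows)
    return rows
```

In practice the records would come from iterating over the "train" split of the dataset. The sketch only uses the standard library so that it can be run without downloading anything.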
📚 Documentation
Current most trained model
The most trained model is italian_59kh/model_464400.safetensors (approximately 70 epochs).
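The epoch figure can be checked against the step count in the file name. The short calculation below assumes the number in `model_464400.safetensors` is the global training step and uses the 6717 steps per epoch stated in the features list.

```python
STEPS_PER_EPOCH = 6717    # from the training data note in this README
CHECKPOINT_STEP = 464400  # assumed to be the step encoded in the file name

epochs = CHECKPOINT_STEP / STEPS_PER_EPOCH
print(f"{epochs:.1f} epochs")  # prints "69.1 epochs"
```

So the "approximately 70" figure corresponds to a little over 69 full passes over the training split.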
Folder Structure
|- italian_59kh
|  |- checkpoints
italian_59kh
This folder contains the weights at specific steps. The higher the number, the further the model went into training. Note that the weights in this folder cannot be used to resume training; use the checkpoints folder instead.
italian_59kh/checkpoints
This folder contains the weights of the checkpoints at specific steps. The higher the number, the further the model went into training. The weights in this folder can be used as a starting point to continue training.
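Since a higher number in the file name means more training, picking the most trained file in either folder can be automated. The sketch below assumes files are named `model_<step>.<extension>`, as in `model_464400.safetensors`; the naming inside the checkpoints folder is not stated here, so the pattern may need adjusting. `latest_weights` is a hypothetical helper, not part of F5-TTS.

```python
import re
from pathlib import Path

# Assumed naming scheme: model_<step>.<extension>
STEP_PATTERN = re.compile(r"model_(\d+)\.\w+$")


def latest_weights(folder):
    """Return (step, path) for the file with the highest step, or None."""
    candidates = []
    for path in Path(folder).iterdir():
        match = STEP_PATTERN.match(path.name)
        if path.is_file() and match:
            candidates.append((int(match.group(1)), path))
    # Compare on the numeric step only, so model_9000 sorts below model_10000.
    return max(candidates, key=lambda item: item[0], default=None)
```

Calling `latest_weights("italian_59kh")` would select weights for inference, while `latest_weights("italian_59kh/checkpoints")` would select a starting point for further training.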
📄 License
The project is licensed under cc-by-4.0.
⚠️ Important Note
UPDATE: A better version with improved prosody is available here => https://huggingface.co/alien79/F5-TTS-italian
💡 Usage Tip
The model has two known limitations: it has forgotten English, and its Italian pronunciation is imperfect. The available checkpoints can be used to extend training, possibly with different datasets.
| Property | Details |
|---|---|
| Datasets | facebook/multilingual_librispeech |
| Language | it |
| Base Model | SWivid/F5-TTS |
| Pipeline Tag | text-to-speech |
| License | cc-by-4.0 |
| Library Name | f5-tts |