parler-tts-mini-jenny-30H Open-source TTS Model - Free Deployment to Achieve English Text-to-Speech


Parler Tts Mini Jenny 30H

Developed by parler-tts

Jenny TTS is a text-to-speech model based on the transformers library, supporting English speech synthesis.

Speech Synthesis

Transformers

English

#English Speech Synthesis #High-Quality TTS #Annotated Dataset Training

Downloads 1,215

Release Time: 4/15/2024

Model Overview

This model focuses on English text-to-speech: it converts input text into natural-sounding speech.

Model Features

High-Quality Speech Synthesis

Capable of generating natural and fluent English speech output.

Transformers-Based

Runs on the transformers library, with the model class provided by the parler-tts package.

English Support

Specialized in converting English text to speech.

Model Capabilities

English Text-to-Speech

Speech Synthesis

Use Cases

Voice Assistants

Virtual Assistant Voice Generation

Provides natural speech output for virtual assistants.

Enhances the user experience

Audiobooks

Text-to-Speech Conversion

Converts text content into speech for audiobook production.

Improves accessibility for visually impaired users

🚀 Parler-TTS Mini v0.1 - Jenny

This is a version of Parler-TTS Mini v0.1 fine-tuned on Jenny (she's Irish ☘️), a 30-hour, single-speaker, high-quality dataset suitable for training a TTS model. Usage is similar to Parler-TTS v0.1: just specify the keyword "Jenny" in the voice description.
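As a minimal sketch of what "specify the keyword Jenny in the voice description" can look like in practice, the helper below assembles a description string that always names the speaker. The function name `build_description` and its parameters are hypothetical and not part of parler-tts; the default wording simply mirrors the example description used in the usage snippet on this page.

```python
def build_description(
    pace="an average pace",
    delivery="an animated delivery",
    environment="a very confined sounding environment",
    quality="clear audio quality",
    speaker="Jenny",
):
    """Compose a voice description that always names the speaker.

    Hypothetical helper: the sentence template is an assumption based on
    the example description on this page, not a documented format.
    """
    return f"{speaker} speaks at {pace} with {delivery} in {environment} with {quality}."


description = build_description()
print(description)
```

Changing one argument, for example `build_description(pace="a slow pace")`, keeps "Jenny" in the text while varying the rest. How strongly the model responds to each attribute is not stated here and would need to be checked by listening.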

🚀 Quick Start

📦 Installation

You can install the necessary library using the following command:

pip install git+https://github.com/huggingface/parler-tts.git
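After installing, it can be useful to confirm that the packages imported by the usage snippet are available. The check below is a small sketch using only the standard library; the helper name `missing_packages` is hypothetical, and the list of package names is taken from the imports in the snippet on this page. Whether the pip command above installs all of them (for example `soundfile`) is not stated here, so treat that as something to verify.

```python
import importlib.util


def missing_packages(names=("parler_tts", "transformers", "torch", "soundfile")):
    """Return the top-level packages from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All packages used by the usage snippet are importable.")
```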

💻 Usage Examples

Basic Usage

You can use the model with the following inference snippet:

import torch
from parler_tts import ParlerTTSForConditionalGeneration
from transformers import AutoTokenizer
import soundfile as sf

# Use the first GPU when available, otherwise fall back to the CPU.
device = "cuda:0" if torch.cuda.is_available() else "cpu"

# Load the fine-tuned checkpoint and its tokenizer.
model = ParlerTTSForConditionalGeneration.from_pretrained("parler-tts/parler-tts-mini-jenny-30H").to(device)
tokenizer = AutoTokenizer.from_pretrained("parler-tts/parler-tts-mini-jenny-30H")

# `prompt` is the text to be spoken; `description` describes the voice and names "Jenny".
prompt = "Hey, how are you doing today? My name is Jenny, and I'm here to help you with any questions you have."
description = "Jenny speaks at an average pace with an animated delivery in a very confined sounding environment with clear audio quality."

# Tokenize the description and the prompt separately with the same tokenizer.
input_ids = tokenizer(description, return_tensors="pt").input_ids.to(device)
prompt_input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)

# Generate the waveform, move it to the CPU, and save it as a WAV file.
generation = model.generate(input_ids=input_ids, prompt_input_ids=prompt_input_ids)
audio_arr = generation.cpu().numpy().squeeze()
sf.write("parler_tts_out.wav", audio_arr, model.config.sampling_rate)
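If `soundfile` is not available, the generated samples could also be saved with Python's standard `wave` module. The sketch below is an alternative to `sf.write`, not part of the original snippet. It assumes the audio is a one-dimensional (mono) sequence of floats roughly in the range [-1.0, 1.0]; that range is an assumption about the model output, and out-of-range values are clipped. The function name `write_wav_pcm16` is hypothetical.

```python
import array
import sys
import wave


def write_wav_pcm16(path, samples, sampling_rate):
    """Write mono float samples to a 16-bit PCM WAV file.

    Assumes `samples` is a 1-D iterable of floats in [-1.0, 1.0];
    values outside that range are clipped before conversion.
    """
    pcm = array.array(
        "h",
        (int(max(-1.0, min(1.0, float(s))) * 32767) for s in samples),
    )
    # WAV data is little-endian, so swap bytes on big-endian machines.
    if sys.byteorder == "big":
        pcm.byteswap()
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wav_file.setframerate(int(sampling_rate))
        wav_file.writeframes(pcm.tobytes())


# Tiny demonstration with four hand-written samples at 16 kHz.
write_wav_pcm16("demo_out.wav", [0.0, 0.5, -0.5, 0.0], 16000)
```

With the variables from the snippet above, the call would be `write_wav_pcm16("parler_tts_out.wav", audio_arr, model.config.sampling_rate)`.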

📚 Documentation

Fine-tuning Guide

Fine-tuning guide on Colab:

Citation

If you found this repository useful, please consider citing this work and also the original Stability AI paper:

@misc{lacombe-etal-2024-parler-tts,
  author = {Yoach Lacombe and Vaibhav Srivastav and Sanchit Gandhi},
  title = {Parler-TTS},
  year = {2024},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/huggingface/parler-tts}}
}

@misc{lyth2024natural,
  title = {Natural language guidance of high-fidelity text-to-speech with synthetic annotations},
  author = {Dan Lyth and Simon King},
  year = {2024},
  eprint = {2402.01912},
  archivePrefix = {arXiv},
  primaryClass = {cs.SD}
}

📄 License

Attribution is required in software/websites/projects/interfaces (including voice interfaces) that generate audio in response to user action using this dataset. Attribution means: the voice must be referred to as "Jenny", and where at all practical, "Jenny (Dioco)". Attribution is not required when distributing the generated clips (although welcome). Commercial use is permitted. Don't do unfair things like claim the dataset is your own. No further restrictions apply.

Additional Resources

Model Information

Property	Details
Library Name	transformers
Tags	text-to-speech, annotation
Language	en
Pipeline Tag	text-to-speech
Inference	false
Datasets	ylacombe/jenny-tts-10k-tagged, reach-vb/jenny_tts_dataset
