TTS HiFiGAN

Developed by NVIDIA
HiFiGAN is a Generative Adversarial Network (GAN) model capable of generating high-quality audio from mel-spectrograms, suitable for text-to-speech systems.
Downloads 5,022
Release Time: 6/29/2022

Model Overview

This model is a vocoder for text-to-speech systems: it converts mel-spectrograms into natural-sounding speech. Built on a GAN architecture, it pairs particularly well with spectrogram generation models such as FastPitch.
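As a rough illustration of how such a two-stage pipeline is commonly wired together, the sketch below assumes the NVIDIA NeMo toolkit, the soundfile package, and the checkpoint names nvidia/tts_en_fastpitch and nvidia/tts_hifigan. None of these are stated on this page, so treat them as assumptions and check the NeMo documentation for your installed version. The helper name synthesize is hypothetical.

```python
SAMPLE_RATE = 22050  # output sampling rate stated for this model


def synthesize(text, out_path="speech.wav"):
    """Text -> mel-spectrogram (FastPitch) -> waveform (HiFiGAN) -> WAV file."""
    # Imported lazily so this module loads even when NeMo is not installed.
    # Assumption: nemo_toolkit[tts] and soundfile are available.
    import soundfile as sf
    from nemo.collections.tts.models import FastPitchModel, HifiGanModel

    # Assumption: these pretrained checkpoint names resolve in your NeMo version.
    spec_generator = FastPitchModel.from_pretrained("nvidia/tts_en_fastpitch")
    vocoder = HifiGanModel.from_pretrained(model_name="nvidia/tts_hifigan")
    spec_generator.eval()
    vocoder.eval()

    # Stage 1: text to mel-spectrogram.
    tokens = spec_generator.parse(text)
    spectrogram = spec_generator.generate_spectrogram(tokens=tokens)

    # Stage 2: mel-spectrogram to waveform (this is the vocoder's job).
    audio = vocoder.convert_spectrogram_to_audio(spec=spectrogram)

    # Write the first (only) item in the batch as a mono WAV file.
    sf.write(out_path, audio.to("cpu").detach().numpy()[0], SAMPLE_RATE)
    return out_path
```

Calling synthesize("Hello, world.") would download both checkpoints on first use and write speech.wav to the working directory, provided the assumptions above hold.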

Model Features

High-quality audio generation
Uses a GAN architecture to generate high-fidelity speech at an output sampling rate of 22,050 Hz
Efficient training
Employs multi-scale and multi-period discriminators to improve training stability
Riva compatibility
Can be integrated with the NVIDIA Riva Speech AI SDK for efficient deployment
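To make the "multi-period" idea a little more concrete, here is a minimal sketch of the data layout only, not of the discriminator network itself. In the original HiFi-GAN paper, each period discriminator folds the 1-D waveform into a 2-D grid whose width is the period, so that every column holds samples spaced one period apart. The period values below come from that paper rather than from this page, the function names are hypothetical, and zero padding is used for simplicity where the reference implementation uses reflection padding.

```python
PERIODS = (2, 3, 5, 7, 11)  # periods used in the original HiFi-GAN paper


def fold_by_period(samples, period):
    """Reshape a 1-D sample list into rows of length `period`.

    The tail is zero-padded so the length is a multiple of `period`
    (a simplification; reflection padding is used in practice).
    """
    padded = list(samples)
    remainder = len(padded) % period
    if remainder:
        padded.extend([0.0] * (period - remainder))
    return [padded[i:i + period] for i in range(0, len(padded), period)]


def column(grid, index):
    """Return one column of the folded grid: samples spaced `period` apart."""
    return [row[index] for row in grid]
```

For example, folding the seven samples 0 through 6 with period 3 yields three rows, and the first column contains samples 0, 3, and 6. A 2-D convolution over such a grid can therefore pick up structure that repeats every `period` samples.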

Model Capabilities

Mel-spectrogram to audio conversion
Speech synthesis
High-fidelity audio generation
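The relationship between mel-spectrogram length and output audio length is simple arithmetic, sketched below. The 22,050 Hz rate is taken from this page; the hop length of 256 samples per frame is an assumption (a common setting for vocoders at this rate, not confirmed here), and the function names are hypothetical.

```python
SAMPLE_RATE = 22050  # Hz, as stated for this model
HOP_LENGTH = 256     # assumption: samples generated per mel frame


def frames_to_samples(n_frames, hop_length=HOP_LENGTH):
    """Number of waveform samples produced for `n_frames` mel frames."""
    return n_frames * hop_length


def frames_to_seconds(n_frames, hop_length=HOP_LENGTH, sample_rate=SAMPLE_RATE):
    """Duration in seconds of the audio produced for `n_frames` mel frames."""
    return frames_to_samples(n_frames, hop_length) / sample_rate
```

Under these assumptions, a 100-frame mel-spectrogram corresponds to 25,600 samples, or roughly 1.16 seconds of audio.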

Use Cases

Speech synthesis systems
Text-to-speech systems
Works with models like FastPitch to build a complete TTS pipeline
Generates natural and fluent American English speech
Voice assistants
Provides high-quality speech output for conversational systems