BigVGAN v2 22kHz 80-band 256x

Developed by NVIDIA
BigVGAN is a general-purpose neural vocoder trained at scale, capable of generating high-quality audio waveforms from mel spectrograms.
Downloads 503.23k
Release Time: 7/15/2024

Model Overview

BigVGAN is a high-performance neural vocoder. Large-scale training lets it handle varied audio types, including speech, environmental sounds, and musical instruments. The v2 release speeds up inference by 1.5-3x using custom CUDA kernels.

Model Features

High-performance inference: Achieves a 1.5-3x inference speedup with custom CUDA kernels
Large-scale training: Trained on diverse audio datasets to support multiple audio types
High-quality audio generation: Achieves state-of-the-art results on benchmarks like LibriTTS
Multi-configuration support: Provides pretrained models with various sampling rates (22kHz/24kHz/44kHz) and upsampling factors
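
The sampling rate and upsampling factor in a configuration name determine how much audio a given mel spectrogram produces. The sketch below works through that arithmetic for this model. It assumes that "22kHz" means 22,050 Hz and that "256x" means each mel frame expands to 256 waveform samples; the function names are illustrative and are not part of any BigVGAN API.

```python
# Minimal sketch of the mel-frame to waveform-length arithmetic.
# Assumptions (not stated on this page): 22kHz = 22,050 Hz, and the
# "256x" upsampling factor maps one mel frame to 256 output samples.

SAMPLE_RATE_HZ = 22050      # assumed exact rate for the "22khz" config
UPSAMPLING_FACTOR = 256     # from the "256x" part of the model name
NUM_MEL_BANDS = 80          # from the "80band" part of the model name


def waveform_length(num_mel_frames: int,
                    upsampling_factor: int = UPSAMPLING_FACTOR) -> int:
    """Number of audio samples generated from num_mel_frames frames."""
    return num_mel_frames * upsampling_factor


def duration_seconds(num_mel_frames: int,
                     sample_rate_hz: int = SAMPLE_RATE_HZ,
                     upsampling_factor: int = UPSAMPLING_FACTOR) -> float:
    """Duration of the generated audio in seconds."""
    return waveform_length(num_mel_frames, upsampling_factor) / sample_rate_hz


if __name__ == "__main__":
    frames = 100
    print(f"mel input: {NUM_MEL_BANDS} bands x {frames} frames")
    print(f"output samples: {waveform_length(frames)}")
    print(f"duration: {duration_seconds(frames):.3f} s")
```

Under these assumptions, a configuration with a different sampling rate or upsampling factor changes only the two constants, which is a quick way to check that a mel spectrogram was extracted with settings matching the chosen model.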

Model Capabilities

Generate high-quality audio from mel spectrograms
Support audio generation at various sampling rates
Fast inference (using CUDA kernels)

Use Cases

Speech synthesis
TTS system backend: Serves as the vocoder component for text-to-speech systems, generating natural and fluent speech
Audio enhancement
Audio super-resolution: Raises the sampling rate and improves the clarity of low-quality audio