Bigvgan Melspec

Developed by cckm
A neural vocoder based on BigVGAN, trained with specific mel-spectrogram inputs, suitable for high-quality audio generation tasks
Downloads 16
Release Time: 1/11/2025

Model Overview

This model is a modified version of NVIDIA's BigVGAN, trained on a specific mel-spectrogram configuration. It is primarily used for audio-to-audio conversion tasks, where it turns mel-spectrograms into high-quality audio output.

Model Features

Optimized Mel-Spectrogram Input
Uses specifically configured mel-spectrograms as input, potentially improving audio generation quality
High PESQ Score
Achieved a PESQ score of 4.340 in evaluation, close to the original NVIDIA checkpoint's score of 4.362
Compatible with Various Mel-Spectrogram Configurations
Supports mel-spectrogram features generated by the vocos library
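
Because a vocoder only works well when its input features match the configuration it was trained on, it can help to check the settings before running inference. The sketch below is a minimal, standard-library-only illustration. The `MelConfig` class, its field values (24 kHz, 1024-point FFT, hop of 256, 100 mel bands), and the helper names are assumptions made for this example and are not taken from the model's published configuration. It also assumes a centered STFT, which yields 1 + floor(samples / hop_length) frames.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class MelConfig:
    # Hypothetical values for illustration only; consult the model's
    # own configuration for the real settings.
    sample_rate: int = 24000
    n_fft: int = 1024
    hop_length: int = 256
    n_mels: int = 100


def num_frames(num_samples: int, cfg: MelConfig) -> int:
    """Frame count for a centered STFT: 1 + floor(num_samples / hop_length)."""
    return 1 + num_samples // cfg.hop_length


def mel_shape(num_samples: int, cfg: MelConfig) -> tuple:
    """Expected (n_mels, frames) shape of the mel-spectrogram for one clip."""
    return (cfg.n_mels, num_frames(num_samples, cfg))


def mismatched_fields(extractor: MelConfig, vocoder: MelConfig) -> list:
    """Names of settings that differ between feature extractor and vocoder."""
    return [
        f.name
        for f in fields(MelConfig)
        if getattr(extractor, f.name) != getattr(vocoder, f.name)
    ]


if __name__ == "__main__":
    vocoder_cfg = MelConfig()
    extractor_cfg = MelConfig(hop_length=512)  # deliberately wrong hop size
    print(mel_shape(24000, vocoder_cfg))
    print(mismatched_fields(extractor_cfg, vocoder_cfg))
```

A non-empty result from `mismatched_fields` indicates that the features should be recomputed with the vocoder's settings before synthesis.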

Model Capabilities

Audio generation
Mel-spectrogram conversion
High-quality speech synthesis

Use Cases

Speech Synthesis
Text-to-Speech Systems
Used as a neural vocoder for the backend of TTS systems
Generates high-quality speech output
Audio Enhancement
Speech Quality Improvement
Used to enhance the clarity and naturalness of low-quality audio