Whisper Large V3 Voice Quality

Developed by tiantiaf
A voice quality classification model based on Whisper Large v3, used to analyze features such as pitch, timbre, volume, clarity, and rhythm of speech.
Downloads 162
Release Time: 5/22/2025

Model Overview

This model implements the voice quality classification method described in 'Vox-Profile: A Benchmark for Characterizing Diverse Speakers and Speech Features with Voice Foundation Models'. It classifies speech along multiple dimensions of voice quality.

Model Features

Multi-dimensional Voice Feature Analysis
Capable of analyzing multiple dimensions of speech features such as pitch, timbre, volume, clarity, and rhythm simultaneously.
Speaker-level Evaluation
Evaluated with the speaker-level macro-averaged F1 score, which keeps the reported results representative.
Efficient Audio Processing
Supports audio input up to 15 seconds long, sampled at 16 kHz and processed as a single (mono) channel.
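
The input constraints above (mono, 16 kHz, at most 15 seconds) can be sketched as a small preprocessing step. This is an illustrative sketch only, not the model's own pipeline: the function names are hypothetical, the resampler is a naive linear interpolation without an anti-aliasing filter, and truncating longer clips is an assumption, since the page does not say how over-length audio is handled. In practice a dedicated audio library would likely be a better choice for resampling.

```python
from typing import List, Sequence

TARGET_SR = 16_000   # sampling rate stated on this page
MAX_SECONDS = 15     # maximum clip length stated on this page


def to_mono(channels: Sequence[Sequence[float]]) -> List[float]:
    """Average per-channel sample lists into a single mono channel."""
    n = min(len(ch) for ch in channels)
    return [sum(ch[i] for ch in channels) / len(channels) for i in range(n)]


def resample_linear(samples: Sequence[float], src_sr: int,
                    dst_sr: int = TARGET_SR) -> List[float]:
    """Naive linear-interpolation resampling (assumption: no anti-aliasing)."""
    if src_sr == dst_sr or not samples:
        return list(samples)
    n_out = int(len(samples) * dst_sr / src_sr)
    out = []
    for j in range(n_out):
        pos = j * src_sr / dst_sr          # position in the source signal
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, len(samples) - 1)]
        out.append(samples[i] * (1 - frac) + nxt * frac)
    return out


def prepare_clip(channels: Sequence[Sequence[float]], src_sr: int) -> List[float]:
    """Downmix to mono, resample to 16 kHz, and keep at most 15 seconds."""
    mono = to_mono(channels)
    resampled = resample_linear(mono, src_sr)
    # Assumption: longer clips are simply truncated to the first 15 seconds.
    return resampled[: TARGET_SR * MAX_SECONDS]
```

For example, `prepare_clip([left, right], 48_000)` would take two equal-length lists of samples recorded at 48 kHz and return one list of 16 kHz samples no longer than 240,000 entries.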

Model Capabilities

Voice Quality Classification
Pitch Analysis
Timbre Analysis
Volume Analysis
Clarity Analysis
Rhythm Analysis

Use Cases

Speech Analysis
Speech Feature Labeling
Automatically labels speech samples with features such as pitch and timbre.
Provides detailed speech feature classification results.
Speaker Feature Analysis
Analyzes the speech feature patterns of speakers.
Generates speaker-level speech feature reports.
Speech Research
Speech Feature Research
Used for research on the correlation between speech features and speaker characteristics.
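
As a rough illustration of the speaker-level reports mentioned in the use cases above, utterance-level labels can be aggregated per speaker. The sketch below assumes the classifier's output has already been reduced to a list of label strings per utterance; the function name `speaker_report` and the example labels are hypothetical and are not taken from the model's actual label set.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def speaker_report(
    predictions: Iterable[Tuple[str, List[str]]]
) -> Dict[str, Dict[str, float]]:
    """Return, per speaker, the fraction of utterances carrying each label.

    `predictions` yields (speaker_id, labels) pairs, one per utterance.
    """
    label_counts: Dict[str, Counter] = defaultdict(Counter)
    utterance_totals: Counter = Counter()
    for speaker_id, labels in predictions:
        utterance_totals[speaker_id] += 1
        # Count each label at most once per utterance.
        label_counts[speaker_id].update(set(labels))
    return {
        spk: {
            label: count / utterance_totals[spk]
            for label, count in sorted(label_counts[spk].items())
        }
        for spk in utterance_totals
    }


# Hypothetical labels, for illustration only.
example = [
    ("spk1", ["high_pitch", "fast"]),
    ("spk1", ["high_pitch"]),
    ("spk2", []),
]
report = speaker_report(example)
```

Here `report["spk1"]` maps `"high_pitch"` to 1.0 and `"fast"` to 0.5, while `report["spk2"]` is empty because none of that speaker's utterances received a label.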