VIT VoxCelebSpoof Mel Spectrogram Synthetic Voice Detection
Developed by MattyB95
A deep learning model for synthetic voice detection, built by fine-tuning a pre-trained Vision Transformer to classify Mel spectrograms of speech.
Downloads 788
Release Time: 1/23/2024
Model Overview
This model detects synthetic voices using the Vision Transformer (ViT) architecture. It is designed to identify the features of synthetic speech in Mel spectrograms and is intended to support voice security applications.
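To make the two ingredients in the overview concrete, the sketch below shows how the Mel scale spaces frequency bands and how many patches a ViT with 16x16 patches sees in a 224x224 input (the sizes encoded in the base checkpoint name google/vit-base-patch16-224-in21k). The HTK-style Mel formula, the band count, and the frequency range are assumptions chosen for illustration; the source does not state the preprocessing actually used, and all function names here are hypothetical.

```python
import math
from typing import List


def hz_to_mel(freq_hz: float) -> float:
    """Convert Hz to mel with the HTK-style formula (assumed variant)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centers(n_mels: int, fmin_hz: float, fmax_hz: float) -> List[float]:
    """Center frequencies (Hz) of n_mels bands evenly spaced on the mel scale."""
    lo, hi = hz_to_mel(fmin_hz), hz_to_mel(fmax_hz)
    step = (hi - lo) / (n_mels + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_mels)]


def vit_patch_count(image_size: int, patch_size: int) -> int:
    """Number of non-overlapping square patches in a square image."""
    per_side = image_size // patch_size
    return per_side * per_side


# Illustrative values only: 8 bands over 0-8000 Hz are assumptions.
centers = mel_band_centers(n_mels=8, fmin_hz=0.0, fmax_hz=8000.0)
print([round(c) for c in centers])  # bands widen as frequency increases
print(vit_patch_count(224, 16))     # 196
```

Evenly spaced mel bands are narrow at low frequencies and wide at high frequencies, which is why a Mel spectrogram gives more vertical resolution to the lower part of the spectrum before the image is split into patches.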
Model Features
High-accuracy detection
Reports 100% accuracy, F1 score, precision, and recall on the evaluation set.
Fine-tuning based on pre-trained model
Fine-tuned from google/vit-base-patch16-224-in21k, reusing the visual feature extraction ability of the pre-trained model.
Efficient Mel spectrogram analysis
Optimized for the Mel spectrogram features of voice signals.
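The four reported metrics all derive from the same confusion counts, so a perfect score on all of them means the evaluation set produced no false positives and no false negatives. A minimal sketch of that relationship follows; the label encoding (1 for synthetic, 0 for genuine) and the function name binary_metrics are assumptions for illustration, not details taken from the model card.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary classifier.

    `positive` marks the class treated as the detection target
    (assumed here to be the synthetic-voice label).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels, not real evaluation data.
perfect = binary_metrics([1, 0, 1, 0], [1, 0, 1, 0])
one_miss = binary_metrics([1, 1, 0, 0], [1, 0, 0, 0])
print(perfect)
print(one_miss)
```

A single missed synthetic sample in the second toy example lowers recall and F1 while precision stays at 1.0, which shows why the four numbers are usually reported together.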
Model Capabilities
Synthetic voice detection
Audio classification
Mel spectrogram analysis
Use Cases
Voice security
Strengthening voice authentication systems
Detects synthetic voice attacks against voice authentication systems.
Identifies synthetic voices to help prevent spoofing attacks.
Audio content review
Detects whether audio content contains synthetic voices.
Helps platforms identify potentially AI-generated voice content.