Wav2vec2 Large Xlsr Deepfake Audio Classification
Developed by Gustking
An audio classification model based on the wav2vec2 architecture and fine-tuned for deepfake audio detection, with reported results for both fake audio detection and gender recognition.
Downloads 345
Release Time: 5/15/2024
Model Overview
This audio classification model is based on the wav2vec2 architecture and fine-tuned for deepfake audio detection. It is primarily used to identify speaker gender and to detect fake audio, and its reported results come from datasets such as ASVspoof2019.
Model Features
High-precision fake audio detection
Achieves an F1 score of 0.9363 and an equal error rate (EER) of 0.0401 on the ASVspoof2019 evaluation subset
Excellent gender recognition capability
Achieves an F1 score of 0.95 on the original evaluation data, with a loss of 0.4056
Based on wav2vec2 architecture
Built on wav2vec2-large-xlsr-53, whose pretrained speech representations provide the audio feature extraction
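The F1 score and equal error rate quoted above can be reproduced for any set of detector scores. The following is a minimal sketch, not the author's evaluation script: it assumes a higher score means "more likely fake", that labels use 1 for fake and 0 for real, and it approximates the EER by the threshold where the false positive and false negative rates are closest. The function names are hypothetical.

```python
import numpy as np


def error_rates(scores, labels, threshold):
    """Return (false positive rate, false negative rate) at one threshold.

    Assumption: label 1 = fake (positive class), label 0 = real.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    predicted_fake = scores >= threshold
    fpr = np.sum(predicted_fake & (labels == 0)) / np.sum(labels == 0)
    fnr = np.sum(~predicted_fake & (labels == 1)) / np.sum(labels == 1)
    return float(fpr), float(fnr)


def equal_error_rate(scores, labels):
    """Approximate the EER: the operating point where FPR and FNR are closest."""
    thresholds = np.unique(np.asarray(scores, dtype=float))
    rates = [error_rates(scores, labels, t) for t in thresholds]
    gaps = [abs(fpr - fnr) for fpr, fnr in rates]
    best = int(np.argmin(gaps))
    fpr, fnr = rates[best]
    return (fpr + fnr) / 2.0, float(thresholds[best])


def f1_score(scores, labels, threshold):
    """F1 for the fake class at a fixed decision threshold."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    predicted_fake = scores >= threshold
    tp = np.sum(predicted_fake & (labels == 1))
    fp = np.sum(predicted_fake & (labels == 0))
    fn = np.sum(~predicted_fake & (labels == 1))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return float(2 * precision * recall / (precision + recall))
```

An EER of 0.0401 means that, at the threshold where the two error types balance, roughly 4% of real clips are flagged as fake and roughly 4% of fake clips are missed.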
Model Capabilities
Audio classification
Gender recognition
Deepfake audio detection
Use Cases
Security detection
Fake audio identification
Used to detect fake audio such as speech synthesis or voice conversion
Achieves an accuracy of 92.86% on the ASVspoof2019 dataset
Speech analysis
Speaker gender recognition
Identifies the gender characteristics of speakers in audio
Achieves an F1 score of 0.95
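For the use cases above, a typical way to run a wav2vec2 classifier is the Hugging Face `transformers` audio-classification pipeline. The sketch below is illustrative only: the repository id in `MODEL_ID` is an assumption derived from the model and developer names on this page, the label names returned by the model are not documented here, and `classify_file` and `top_prediction` are hypothetical helper names.

```python
# Assumption: the model is published on the Hugging Face Hub under this id.
MODEL_ID = "Gustking/wav2vec2-large-xlsr-deepfake-audio-classification"


def top_prediction(results):
    """Pick the highest-scoring entry from a list of {"label", "score"} dicts."""
    best = max(results, key=lambda item: item["score"])
    return best["label"], best["score"]


def classify_file(path, model_id=MODEL_ID):
    """Classify one audio file and return (label, score).

    transformers is imported lazily so the helper above stays usable
    without it. Decoding a file path typically requires ffmpeg.
    """
    from transformers import pipeline

    classifier = pipeline("audio-classification", model=model_id)
    return top_prediction(classifier(path))


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        label, score = classify_file(sys.argv[1])
        print(f"{label}: {score:.4f}")
```

Running the script with a path to an audio file prints the top label and its score; check the model's own label mapping before interpreting the label as real or fake.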