VIT-VoxCelebSpoof Open-source Synthetic Speech Detection Model - Efficiently and Accurately Identify Synthetic Speech

VIT VoxCelebSpoof Mel Spectrogram Synthetic Voice Detection

Developed by MattyB95

A deep-learning synthetic voice detection model, obtained by fine-tuning a pre-trained Vision Transformer on Mel spectrograms of speech.

Speaker Analysis

Transformers

English

Open Source License: MIT

#High-precision voice detection #Synthetic voice recognition #Voice security protection

Downloads 788

Release Time: 1/23/2024

Model Overview

This model is a synthetic voice detection model based on the Vision Transformer (ViT) architecture. It is designed to identify synthetic voice features in Mel spectrograms of speech, for use in voice security applications.

Model Features

High-accuracy detection

Reported accuracy, F1 score, precision, and recall of 1.0000 (to four decimal places) on the evaluation set after three epochs of training.

Fine-tuning based on pre-trained model

Fine-tuned from google/vit-base-patch16-224-in21k, reusing the visual feature extraction ability of the pre-trained model.
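As a rough illustration of what starting from this base checkpoint can look like with the Hugging Face Transformers library, the sketch below loads the checkpoint with a fresh two-class head. The label names `bonafide` and `spoof` and the helper name are assumptions made for the example; the listing does not state the labels or the training script actually used.

```python
# Base checkpoint named in the listing.
BASE_CHECKPOINT = "google/vit-base-patch16-224-in21k"

# Hypothetical label names: the listing does not say what the two classes are called.
ID2LABEL = {0: "bonafide", 1: "spoof"}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def load_vit_for_spoof_detection():
    """Load the base ViT with a newly initialised two-class classification head.

    The import is kept inside the function so that the module can be imported
    without the transformers package; calling the function downloads weights.
    """
    from transformers import ViTForImageClassification

    return ViTForImageClassification.from_pretrained(
        BASE_CHECKPOINT,
        num_labels=len(ID2LABEL),
        id2label=ID2LABEL,
        label2id=LABEL2ID,
    )
```

The classification head is randomly initialised at this point, so the returned model would still need to be trained on labelled spectrogram images before its predictions mean anything.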

Efficient Mel spectrogram analysis

Operates on Mel spectrogram representations of voice signals rather than on raw audio.
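For readers unfamiliar with the Mel scale behind a Mel spectrogram, here is a minimal sketch of the frequency mapping in plain Python. It assumes the common HTK-style formula; the listing does not say which variant or which spectrogram parameters were used for this model, and the function names are illustrative.

```python
import math


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to mels (HTK-style formula, an assumption here)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centres(n_mels: int, fmin_hz: float, fmax_hz: float) -> list:
    """Centre frequencies (Hz) of n_mels bands spaced evenly on the mel scale."""
    lo, hi = hz_to_mel(fmin_hz), hz_to_mel(fmax_hz)
    step = (hi - lo) / (n_mels + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_mels)]
```

Bands spaced evenly in mels are narrow at low frequencies and wide at high frequencies, which is why a Mel spectrogram gives more vertical resolution to the lower part of the speech spectrum.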

Model Capabilities

Synthetic voice detection

Audio classification

Mel spectrogram analysis

Use Cases

Voice security

Hardening of voice authentication systems

Detects synthetic voice attacks against voice authentication systems.

Identifies synthetic voices to help prevent spoofing attacks.
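One way an authentication system might act on the classifier's output is a simple score threshold, sketched below. It assumes predictions arrive as a list of `{"label", "score"}` dictionaries (the shape returned by the Transformers image-classification pipeline) and that the synthetic class is named `spoof`; both the label name and the default threshold are hypothetical and would need checking against the real model.

```python
def is_synthetic(predictions, spoof_label="spoof", threshold=0.5):
    """Return True when the score for the synthetic class reaches the threshold.

    predictions: list of dicts with "label" and "score" keys.
    spoof_label: hypothetical name of the synthetic class.
    threshold:   illustrative cut-off; tune it on held-out data.
    """
    for prediction in predictions:
        if prediction["label"] == spoof_label:
            return prediction["score"] >= threshold
    # Refuse to guess if the expected label is absent from the output.
    raise ValueError(f"label {spoof_label!r} not found in predictions")


# Example with made-up scores: a request flagged as synthetic would be rejected.
sample = [{"label": "spoof", "score": 0.97}, {"label": "bonafide", "score": 0.03}]
reject_request = is_synthetic(sample)
```

Raising an error when the label is missing, rather than returning False, avoids silently accepting audio when the label names do not match the deployed model.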

Audio content moderation

Detects whether audio content contains synthetic voices.

Helps platforms identify potentially AI-generated voice content.

Training Loss	Epoch	Step	Accuracy	F1	Validation Loss	Precision	Recall
0.0048	1.0	29527	0.9998	0.9999	0.0010	0.9998	1.0
0.0	2.0	59054	0.9999	0.9999	0.0006	0.9999	0.9999
0.0	3.0	88581	1.0000	1.0000	0.0002	1.0000	1.0
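The first-epoch row can be sanity-checked, because F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the reported precision (0.9998) and recall (1.0). The confusion-matrix counts in the second example are hypothetical and chosen only to show how the four metrics relate; the listing does not report them.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


# Reported epoch-1 precision and recall give the reported F1 after rounding.
epoch1_f1 = round(f1_score(0.9998, 1.0), 4)

# Hypothetical counts, for illustration only.
example = metrics_from_counts(tp=9998, fp=2, fn=0, tn=10000)
```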

Property	Details
Model Type	VIT-VoxCelebSpoof-Mel_Spectrogram-Synthetic-Voice-Detection
Base Model	google/vit-base-patch16-224-in21k
Tags	generated_from_trainer
Metrics	accuracy, f1, precision, recall
Datasets	MattyB95/VoxCelebSpoof

