Wespeaker Voxceleb Resnet293 LM
A speaker embedding model based on the ResNet293 architecture and optimized with large margin fine-tuning. It supports tasks such as speaker recognition, similarity calculation, and speech segmentation.
Downloads 108
Release Time: 12/28/2023
Model Overview
This model is provided by the Wespeaker project. It uses the ResNet293 architecture, is optimized with large margin fine-tuning, and is intended primarily for speaker recognition and speech processing tasks. It was trained on the VoxCeleb2 development dataset, which includes 5994 speakers.
Model Features
Large Margin Fine-Tuning Optimization
Applies large margin fine-tuning after initial training to improve speaker recognition accuracy
Efficient Architecture
Based on the ResNet293 architecture, maintaining high performance while controlling computational load
Multi-Task Support
Supports various tasks including speaker embedding extraction, similarity calculation, and speech segmentation
Model Capabilities
Speaker Recognition
Speaker Similarity Calculation
Speech Segmentation
Speaker Enrollment and Recognition
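The similarity and enrollment capabilities listed above both come down to comparing embedding vectors. The following is a minimal sketch, assuming cosine similarity is the comparison measure and using toy 3-dimensional vectors in place of real model embeddings. The function names `cosine_similarity` and `recognize` are hypothetical and are not part of the Wespeaker API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def recognize(query, enrolled):
    """Return (name, score) for the enrolled speaker most similar to query."""
    return max(
        ((name, cosine_similarity(query, emb)) for name, emb in enrolled.items()),
        key=lambda pair: pair[1],
    )


# Toy vectors standing in for real speaker embeddings (illustrative only).
enrolled = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
name, score = recognize([0.9, 0.1, 0.0], enrolled)
print(name, round(score, 3))
```

In this sketch, enrollment is simply storing one embedding per speaker name, and recognition picks the stored embedding closest to the query.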
Use Cases
Voice Biometrics
Speaker Verification
Verify whether an audio sample belongs to a specific speaker
Equal error rate (EER) of 0.447% on the VoxCeleb test set
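The equal error rate is the operating point at which the false acceptance rate equals the false rejection rate. The sketch below shows one simple way to approximate it from a list of trial scores and labels. The function name `equal_error_rate` and the toy scores are assumptions made for illustration, and this is not the evaluation script behind the figure reported above.

```python
def equal_error_rate(scores, labels):
    """Approximate EER from similarity scores and labels.

    labels: 1 for same-speaker trials, 0 for different-speaker trials.
    """
    n_target = sum(labels)
    n_nontarget = len(labels) - n_target
    best_gap, eer = float("inf"), None
    for threshold in sorted(set(scores)):
        # A trial is accepted when its score is at or above the threshold.
        false_accepts = sum(
            1 for s, l in zip(scores, labels) if l == 0 and s >= threshold
        )
        false_rejects = sum(
            1 for s, l in zip(scores, labels) if l == 1 and s < threshold
        )
        far = false_accepts / n_nontarget
        frr = false_rejects / n_target
        # Keep the threshold where the two error rates are closest.
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


# Toy trial list (illustrative only): one impostor outscores one target.
toy_scores = [0.9, 0.4, 0.6, 0.1]
toy_labels = [1, 1, 0, 0]
print(equal_error_rate(toy_scores, toy_labels))
```

With only a handful of trials the estimate is coarse. Real evaluations use thousands of trials, so the two error curves cross much more finely.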
Speech Analysis
Meeting Speech Segmentation
Identify and segment different speakers in meeting recordings
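To make the idea of segmenting a recording by speaker concrete, the sketch below groups per-segment embeddings into speakers with a simple greedy rule. It is a deliberately simplified stand-in and not the clustering method used by Wespeaker. The function name `label_segments`, the threshold of 0.5, and the toy 2-dimensional vectors are all assumptions.

```python
import math


def _cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def label_segments(segment_embeddings, threshold=0.5):
    """Greedy labeling of segments by speaker.

    Each segment joins the most similar known speaker if the similarity
    reaches the threshold; otherwise it starts a new speaker.
    """
    representatives = []  # one embedding per discovered speaker
    labels = []
    for emb in segment_embeddings:
        sims = [_cosine(emb, rep) for rep in representatives]
        if sims and max(sims) >= threshold:
            labels.append(sims.index(max(sims)))
        else:
            representatives.append(emb)
            labels.append(len(representatives) - 1)
    return labels


# Toy embeddings for five consecutive segments (illustrative only).
segments = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [1.0, 0.1]]
labels = label_segments(segments)
print(labels)
```

Each output label is a speaker index per segment. Consecutive segments with the same label can then be merged into one speaker turn.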