Diar Sortformer 4spk V1

Developed by NVIDIA
An end-to-end speaker diarization model based on the Sortformer architecture. It resolves the speaker permutation problem by assigning speaker labels in order of each speaker's arrival time, and it supports up to 4 speakers.
Downloads 385.49k
Release Time: 12/9/2024

Model Overview

This model uses the Sortformer architecture, which is designed for speaker diarization: it determines who spoke when in multi-speaker conversations and reports speakers in the order in which they first speak.
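The arrival-time ordering idea can be illustrated with a small sketch. This is not the model's training code; it only shows, on hypothetical frame-level labels, how sorting speakers by the first frame in which they speak removes the ambiguity between labelings that differ only in speaker order. The function name and the label layout are assumptions made for illustration.

```python
from typing import List


def sort_speakers_by_arrival(activity: List[List[int]]) -> List[List[int]]:
    """Reorder speaker columns so the earliest-starting speaker comes first.

    activity[t][k] is 1 if (hypothetical) speaker k is active in frame t, else 0.
    """
    if not activity:
        return []
    num_frames = len(activity)
    num_speakers = len(activity[0])

    def first_active_frame(k: int) -> int:
        # Index of the first frame in which speaker k talks.
        for t in range(num_frames):
            if activity[t][k]:
                return t
        return num_frames  # speakers who never talk are sorted last

    order = sorted(range(num_speakers), key=first_active_frame)
    return [[frame[k] for k in order] for frame in activity]


# Two labelings of the same audio that differ only in column order.
labels_a = [[0, 1], [0, 1], [1, 0], [1, 1]]
labels_b = [[1, 0], [1, 0], [0, 1], [1, 1]]

# After sorting, both collapse to the same canonical labeling.
canonical_a = sort_speakers_by_arrival(labels_a)
canonical_b = sort_speakers_by_arrival(labels_b)
```

Because both labelings map to one canonical form, a single fixed target exists for each recording, which is the intuition behind avoiding the permutation problem.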

Model Features

Innovative Sortformer Architecture
Uses a training objective that differs from those of existing end-to-end diarization models: speakers are ordered by arrival time, which resolves the permutation problem.
High-Performance Speaker Identification
Achieves a diarization error rate (DER) of 14.76% on the DIHARD3 evaluation set and 5.85% on two-speaker calls.
Multi-Speaker Support
Capable of identifying up to 4 speakers simultaneously, suitable for multi-party interaction scenarios such as meeting transcriptions and customer service dialogues.
Efficient Processing Capability
Can process recordings of up to approximately 12 minutes on an RTX A6000 GPU, which covers most practical applications.
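For readers unfamiliar with the DER figures quoted above, the following is a minimal sketch of a frame-level DER computation. It assumes hypothesis speaker labels have already been mapped to reference labels and it ignores scoring details such as collars, which the text above does not specify; the function name and input layout are hypothetical.

```python
from typing import List, Set


def frame_level_der(reference: List[Set[str]], hypothesis: List[Set[str]]) -> float:
    """DER = (missed speech + false alarm + speaker confusion) / reference speech.

    Each list element is the set of speakers active in one frame.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same number of frames")

    missed = false_alarm = confusion = total = 0
    for ref, hyp in zip(reference, hypothesis):
        n_ref, n_hyp = len(ref), len(hyp)
        n_correct = len(ref & hyp)
        missed += max(0, n_ref - n_hyp)             # reference speakers not detected
        false_alarm += max(0, n_hyp - n_ref)        # extra speakers detected
        confusion += min(n_ref, n_hyp) - n_correct  # detected, but wrong speaker
        total += n_ref

    if total == 0:
        raise ValueError("reference contains no speech")
    return (missed + false_alarm + confusion) / total


# Five frames: one confusion, one missed overlap speaker, one false alarm.
reference = [{"A"}, {"A"}, {"A", "B"}, {"B"}, set()]
hypothesis = [{"A"}, {"B"}, {"A"}, {"B"}, {"B"}]
der = frame_level_der(reference, hypothesis)  # 3 errors over 5 reference speaker-frames
```

A lower value is better; a DER of 5.85% means that about 5.85% of reference speech time is attributed incorrectly under the benchmark's scoring rules.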

Model Capabilities

Speaker identification
Speech segment ordering
Multi-speaker conversation analysis
Offline speech processing

Use Cases

Meeting transcription
Meeting Speaker Identification
Automatically identifies and orders speech segments from different speakers in meeting recordings
DER of 6.86% on CALLHOME American English telephone conversations
Customer service analysis
Customer Service Dialogue Analysis
Identifies dialogue segments between customer service representatives and customers
DER of 5.85% in two-speaker conversations
Transcription assistance
Multi-Speaker Speech Transcription
Provides speaker segmentation information for speech transcription systems
DER of 8.46% in three-speaker conversations
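As an illustration of how speaker segmentation information might be handed to a transcription system, the sketch below converts frame-level speaker activity scores into (start, end, speaker) segments. The frame duration, the threshold, the function name, and the input layout are all assumptions made for this example; they are not taken from the model's documentation.

```python
from typing import List, Tuple

FRAME_SEC = 0.08  # assumed frame duration in seconds (hypothetical)
THRESHOLD = 0.5   # assumed activity threshold (hypothetical)


def probs_to_segments(
    probs: List[List[float]],
    frame_sec: float = FRAME_SEC,
    threshold: float = THRESHOLD,
) -> List[Tuple[float, float, int]]:
    """Turn per-frame, per-speaker scores into (start_sec, end_sec, speaker) segments.

    probs[t][k] is the score that speaker k is active in frame t.
    """
    segments: List[Tuple[float, float, int]] = []
    if not probs:
        return segments

    num_speakers = len(probs[0])
    for k in range(num_speakers):
        start = None  # frame index where the current run of activity began
        for t, frame in enumerate(probs):
            active = frame[k] >= threshold
            if active and start is None:
                start = t
            elif not active and start is not None:
                segments.append((start * frame_sec, t * frame_sec, k))
                start = None
        if start is not None:  # close a run that lasts until the final frame
            segments.append((start * frame_sec, len(probs) * frame_sec, k))

    segments.sort()  # chronological order, ties broken by end time and speaker
    return segments


# Speaker 0 talks in the first two frames, speaker 1 in the last two.
example_probs = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.2, 0.9]]
example_segments = probs_to_segments(example_probs)
```

A transcription system could then align each recognized word with the segment whose time span contains it. Overlapping speech is preserved, since each speaker is segmented independently.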