Sew D Base Plus 400k Ft Ls100h

Developed by asapp
SEW-D-base+ is an efficient speech recognition model developed by ASAPP Research. It is pre-trained on speech audio sampled at 16 kHz and performs well on the LibriSpeech benchmark.
Release Time : 3/2/2022

Model Overview

This is an efficient pre-trained speech model that can be fine-tuned for downstream tasks such as automatic speech recognition (ASR), speaker recognition, and intent classification. Compared with wav2vec 2.0, it significantly improves inference efficiency while maintaining accuracy.
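As an illustration of the speech recognition use, the following is a minimal sketch of transcribing one utterance with the Hugging Face transformers library. The Hub identifier asapp/sew-d-base-plus-400k-ft-ls100h is an assumption inferred from the model name and developer shown on this page, and the transcribe helper is a hypothetical name introduced here. The sketch rejects audio that is not sampled at 16 kHz, since that is the rate the model was pre-trained on.

```python
EXPECTED_SAMPLING_RATE = 16_000  # the model was pre-trained on 16 kHz audio

# Assumption: Hub id inferred from the model name and developer, not stated on this page.
MODEL_ID = "asapp/sew-d-base-plus-400k-ft-ls100h"


def transcribe(waveform, sampling_rate, model_id=MODEL_ID):
    """Return a greedy CTC transcription of a mono waveform (1-D sequence of floats)."""
    if sampling_rate != EXPECTED_SAMPLING_RATE:
        raise ValueError(
            f"expected {EXPECTED_SAMPLING_RATE} Hz audio, got {sampling_rate} Hz; "
            "resample the input first"
        )

    # Heavy imports are deferred so the sampling-rate check works without them.
    import torch
    from transformers import SEWDForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = SEWDForCTC.from_pretrained(model_id)
    model.eval()

    # Normalize the raw waveform and wrap it in a batch of size one.
    inputs = processor(waveform, sampling_rate=sampling_rate, return_tensors="pt")

    with torch.no_grad():
        logits = model(inputs.input_values).logits

    # Greedy decoding: take the most likely token per frame, then let the
    # processor collapse repeated tokens and remove CTC blanks.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

Calling transcribe(samples, 16_000) downloads the weights on first use, so it needs network access plus the torch and transformers packages. Audio recorded at another rate has to be resampled before it is passed in.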

Model Features

Efficient Inference
Achieves 1.9x inference speedup compared to wav2vec 2.0
Performance Optimization
Reduces word error rate (WER) by a relative 13.5% compared to wav2vec 2.0 in the LibriSpeech 100h-960h semi-supervised setting
Multi-Task Adaptation
Can be fine-tuned for various downstream tasks, including speech recognition, speaker recognition, intent classification, etc.

Model Capabilities

Speech Recognition
Speaker Recognition
Intent Classification
Emotion Recognition

Use Cases

Speech Transcription
Meeting Transcription
Automatically transcribes meeting recordings into text records
Reported WER of 4.34 on the LibriSpeech clean test set
Voice Assistant
Used as the speech recognition module for smart voice assistants
Reported WER of 9.45 on the LibriSpeech other test set
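
The WER figures above are word error rates: the number of word substitutions, deletions, and insertions needed to turn the model output into the reference transcript, divided by the number of reference words. The following is a minimal sketch of that calculation for a single utterance. The word_error_rate function is a hypothetical helper, not part of any library, and it splits on whitespace without the text normalization a real evaluation would apply.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between the two strings, divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("x" for "b") and one deletion ("d") over four reference words.
    print(word_error_rate("a b c d", "a x c"))  # 0.5
```

Benchmark scores are normally computed over a whole test set by summing the edit counts and the reference word counts across utterances before dividing, then expressed as a percentage.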