SEW-D Tiny 100k
SEW-D (Squeezed and Efficient Wav2vec with Disentangled Attention) is a pre-trained speech model developed by ASAPP Research. It was pre-trained on speech audio sampled at 16 kHz and is intended to be fine-tuned for downstream speech tasks.
Release Time: 3/2/2022
Model Overview
SEW-D is a speech pre-training model designed for tasks such as automatic speech recognition. Its architecture was chosen to improve both recognition accuracy and inference speed relative to wav2vec 2.0.
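Because the model was pre-trained on 16 kHz audio, input speech should also be sampled at 16 kHz. The following is a minimal sketch, using only the Python standard library, of checking a WAV file's sample rate before passing it to the model. The helper names `make_sine_wav` and `needs_resampling` are hypothetical and exist only for this illustration, and the sine tone is a stand-in for real speech.

```python
import io
import math
import struct
import wave

EXPECTED_RATE = 16_000  # sample rate of the audio SEW-D was pre-trained on


def make_sine_wav(rate, seconds=0.1, freq=440.0):
    """Build an in-memory mono 16-bit WAV (hypothetical stand-in for speech)."""
    n_samples = int(rate * seconds)
    frames = b"".join(
        struct.pack("<h", int(32767 * 0.5 * math.sin(2 * math.pi * freq * i / rate)))
        for i in range(n_samples)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)      # mono
        w.setsampwidth(2)      # 16-bit samples
        w.setframerate(rate)
        w.writeframes(frames)
    buf.seek(0)
    return buf


def needs_resampling(wav_file, expected_rate=EXPECTED_RATE):
    """Return True if the WAV's sample rate differs from the expected rate."""
    with wave.open(wav_file, "rb") as w:
        return w.getframerate() != expected_rate


if __name__ == "__main__":
    for rate in (16_000, 44_100):
        print(rate, "needs resampling:", needs_resampling(make_sine_wav(rate)))
```

A file that fails this check would need to be resampled to 16 kHz with an audio library of your choice before inference.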
Model Features
Efficient Inference
Achieves a 1.9x inference speedup over wav2vec 2.0.
Performance Improvement
At a similar inference time, reduces word error rate by 25-50% compared to wav2vec 2.0 across different model sizes.
Optimized Architecture
The architecture results from a systematic analysis of design choices that affect both accuracy and efficiency.
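The word error rate figures above are relative reductions, which are easy to confuse with absolute ones. The sketch below shows one common way to compute word error rate (word-level edit distance divided by reference length) and a relative reduction between two systems. The function names and all numbers in it are illustrative assumptions, not results from the SEW-D evaluation.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = cur
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer, new_wer):
    """Fraction by which new_wer improves on baseline_wer."""
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # One substitution and one deletion against six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
    # Made-up WERs: a drop from 10.0% to 8.65% is a 13.5% relative reduction.
    print(relative_reduction(10.0, 8.65))
```

In the second call, the absolute improvement is only 1.35 percentage points, which is why relative and absolute reductions should be reported distinctly.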
Model Capabilities
Speech recognition
Speaker recognition
Intent classification
Emotion recognition
Use Cases
Speech processing
Automatic Speech Recognition
Convert speech to text
A 13.5% relative reduction in word error rate compared to wav2vec 2.0 in the 100h-960h semi-supervised setup on LibriSpeech
Speaker Recognition
Identify which speaker is talking