SEW-D-tiny-100k: Open-Source Pre-trained Speech Model for Multiple Downstream Speech Tasks

SEW-D-tiny-100k

Developed by ASAPP

SEW-D (Squeezed and Efficient Wav2vec with Disentangled attention) is a compact, efficient speech pre-training model developed by ASAPP Research. It is pre-trained on speech audio sampled at 16 kHz and is suitable for a range of downstream speech tasks.
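Because the model was pre-trained on 16 kHz audio, it is usually worth confirming that input files use the same rate before passing them on. The following is a minimal sketch using only Python's standard `wave` module; the helper name `check_sample_rate` is hypothetical and not part of any SEW-D tooling.

```python
import wave

# The source states that SEW-D was pre-trained on 16 kHz speech audio.
TARGET_SAMPLE_RATE = 16_000


def check_sample_rate(path: str, expected: int = TARGET_SAMPLE_RATE) -> int:
    """Return the WAV file's sample rate, raising if it differs from `expected`.

    Hypothetical helper for illustration; it only inspects the WAV header.
    """
    with wave.open(path, "rb") as wav_file:
        rate = wav_file.getframerate()
    if rate != expected:
        raise ValueError(
            f"{path} is sampled at {rate} Hz; resample to {expected} Hz first"
        )
    return rate
```

Files recorded at another rate would need to be resampled with an audio library of your choice before use.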

Speech Recognition

Transformers

Language: English

Open Source License: Apache-2.0

#Speech recognition pre-training #Efficient inference acceleration #16 kHz audio adaptation

Downloads 1,074

Release Time: 3/2/2022

Model Overview

SEW-D is an efficient speech pre-training model designed for tasks such as automatic speech recognition. Its optimized architecture improves both accuracy and inference efficiency relative to wav2vec 2.0.

Model Features

Efficient Inference

Achieves 1.9x inference speedup compared to wav2vec 2.0.

Performance Improvement

At a similar inference time, reduces word error rate by 25-50% compared to wav2vec 2.0.

Optimized Architecture

Improves both performance and efficiency, based on a systematic analysis of architectural design choices.
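A speedup figure such as the 1.9x quoted above is a ratio of latencies: the baseline's time per call divided by the candidate's. The sketch below shows one way such a ratio could be measured for any two callables. The function names are hypothetical, and this is not the benchmarking procedure used by the SEW-D authors.

```python
import statistics
import time
from typing import Callable


def median_latency(fn: Callable[[], object], runs: int = 5) -> float:
    """Median wall-clock seconds for one call to `fn` over `runs` repetitions."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(
    baseline: Callable[[], object],
    candidate: Callable[[], object],
    runs: int = 5,
) -> float:
    """How many times faster `candidate` is than `baseline` (>1 means faster)."""
    return median_latency(baseline, runs) / median_latency(candidate, runs)
```

In practice each callable would wrap a forward pass of one model on the same batch of audio, with warm-up runs and identical hardware for both.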

Model Capabilities

Speech recognition

Speaker recognition

Intent classification

Emotion recognition

Use Cases

Speech processing

Automatic Speech Recognition

Convert speech to text

13.5% relative reduction in word error rate on the LibriSpeech dataset, compared to wav2vec 2.0

Speaker Recognition

Identify which speaker is talking
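The word error rate figures quoted for automatic speech recognition above are relative reductions, which are easy to confuse with absolute differences. The sketch below computes word error rate as word-level edit distance divided by reference length, then the relative reduction between two rates. The function names and the numbers in the usage note are illustrative assumptions, not results from the source.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which `new_wer` improves on `baseline_wer`."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer
```

For example, moving from a hypothetical baseline of 10.0% to 8.65% is a relative reduction of about 13.5%, even though the absolute difference is only 1.35 percentage points.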

Property	Details
Model Type	SEW-D-tiny
Training Datasets	librispeech_asr
Tags	speech
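As a rough illustration of basic usage with the Transformers library, the sketch below extracts hidden states from a 16 kHz waveform. It assumes the checkpoint is published on the Hugging Face Hub as `asapp/sew-d-tiny-100k` together with a feature-extractor config, and that `torch` and `transformers` are installed. The function name `extract_hidden_states` is hypothetical.

```python
MODEL_ID = "asapp/sew-d-tiny-100k"  # assumed Hugging Face Hub identifier
TARGET_SAMPLE_RATE = 16_000  # the model was pre-trained on 16 kHz audio


def extract_hidden_states(waveform):
    """Return SEW-D hidden states for a 1-D float waveform sampled at 16 kHz.

    Hypothetical helper: downloads the checkpoint on first use.
    """
    # Imported lazily so this module loads even without torch/transformers.
    import torch
    from transformers import SEWDModel, Wav2Vec2FeatureExtractor

    feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(MODEL_ID)
    model = SEWDModel.from_pretrained(MODEL_ID)
    model.eval()

    inputs = feature_extractor(
        waveform, sampling_rate=TARGET_SAMPLE_RATE, return_tensors="pt"
    )
    with torch.no_grad():
        outputs = model(inputs.input_values)
    # Tensor of shape (batch, frames, hidden_size).
    return outputs.last_hidden_state
```

A pre-trained checkpoint like this yields frame-level representations rather than transcripts, so a task-specific head would still need to be fine-tuned for speech recognition or classification.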

