Unispeech Sat Base 100h Libri Ft

Developed by Microsoft
An automatic speech recognition model based on the UniSpeech-SAT base model and fine-tuned on 100 hours of LibriSpeech data.
Downloads 643
Release Time: 3/2/2022

Model Overview

This model is designed for automatic speech recognition (ASR). It is based on Microsoft's UniSpeech-SAT architecture, which adds speaker-aware objectives to self-supervised pretraining to improve speaker representations, and it is suited to English speech-to-text tasks.

Model Features

Speaker-Aware Pretraining
Improves speaker representation learning by combining an utterance-wise contrastive loss with the self-supervised learning (SSL) objective
Utterance Mixing Data Augmentation
Creates additional overlapped utterances without supervision and uses them during training, improving the model's ability to distinguish speakers
Large-Scale Pretraining
The UniSpeech-SAT work scales pretraining up to 94,000 hours of public audio data, which supports strong generalization
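
The mixing augmentation described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `mix_utterances`, the `max_overlap_ratio` and `gain` parameters, and the default cap of 50% overlap are assumptions chosen for illustration. The idea is only that a chunk of a second utterance is overlaid on part of the main utterance, so the main speaker still dominates the result.

```python
import random


def mix_utterances(primary, secondary, max_overlap_ratio=0.5, gain=1.0, rng=None):
    """Overlay a random chunk of `secondary` onto a random position in `primary`.

    Both inputs are sequences of float samples at the same sampling rate.
    The output has the same length as `primary`; the inputs are not modified.
    All parameter names and defaults here are illustrative assumptions.
    """
    rng = rng or random.Random()
    if not primary or not secondary:
        return list(primary)

    # The overlapped region covers at most `max_overlap_ratio` of the primary
    # utterance and can never be longer than the secondary utterance.
    max_len = min(len(secondary), int(len(primary) * max_overlap_ratio))
    if max_len < 1:
        return list(primary)
    chunk_len = rng.randint(1, max_len)

    # Choose where the chunk is cut from and where it is added.
    src_start = rng.randint(0, len(secondary) - chunk_len)
    dst_start = rng.randint(0, len(primary) - chunk_len)

    mixed = list(primary)
    for i in range(chunk_len):
        mixed[dst_start + i] += gain * secondary[src_start + i]
    return mixed
```

In a training loop, `secondary` would typically be another utterance drawn from the same batch, so no speaker labels are needed to build the overlapped examples.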

Model Capabilities

English Speech Recognition
Speaker Feature Extraction
Processing of Audio Sampled at 16 kHz
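
As a rough illustration of the speech recognition capability, the sketch below transcribes a waveform with the Hugging Face `transformers` library. It assumes the checkpoint is published on the Hub as `microsoft/unispeech-sat-base-100h-libri-ft`, that `torch` and `transformers` are installed, and that the audio is already a mono float array at 16 kHz. The helper names `check_sampling_rate` and `transcribe` are hypothetical.

```python
# Assumed Hub identifier for this model; adjust if the checkpoint lives elsewhere.
MODEL_ID = "microsoft/unispeech-sat-base-100h-libri-ft"
EXPECTED_RATE = 16_000  # the model expects audio sampled at 16 kHz


def check_sampling_rate(sampling_rate):
    """Reject audio that has not been resampled to 16 kHz."""
    if sampling_rate != EXPECTED_RATE:
        raise ValueError(
            f"expected {EXPECTED_RATE} Hz audio, got {sampling_rate} Hz; resample first"
        )
    return sampling_rate


def transcribe(waveform, sampling_rate=EXPECTED_RATE, model_id=MODEL_ID):
    """Return the greedy CTC transcription of a mono float waveform."""
    check_sampling_rate(sampling_rate)

    # Imported lazily so the helper above works without the heavy dependencies.
    import torch
    from transformers import UniSpeechSatForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = UniSpeechSatForCTC.from_pretrained(model_id)
    model.eval()

    inputs = processor(waveform, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        logits = model(inputs.input_values).logits

    # Greedy decoding: pick the most likely token per frame, then let the
    # processor collapse repeats and remove CTC blank tokens.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

Calling `transcribe(samples)` downloads the checkpoint on first use; loading the processor and model once and reusing them would be preferable when transcribing many files.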

Use Cases

Speech-to-Text
Speech Transcription
Converts English speech content into text
Performs well on the LibriSpeech dataset
Speech Analysis
Speaker Identification
Extracts speaker features from speech
The UniSpeech-SAT paper reports strong performance on the SUPERB benchmark
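
One simple way to experiment with speaker features is to average the model's frame-level hidden states into a single vector per utterance and compare vectors with cosine similarity. The sketch below assumes that approach. It is a common heuristic, not the evaluation procedure used in the paper, and the helper names (`mean_pool`, `cosine_similarity`, `utterance_embedding`) and the Hub identifier are assumptions.

```python
import math


def mean_pool(frames):
    """Average a list of equal-length frame vectors into one utterance vector."""
    if not frames:
        raise ValueError("need at least one frame")
    dim = len(frames[0])
    return [sum(frame[i] for frame in frames) / len(frames) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def utterance_embedding(
    waveform,
    sampling_rate=16_000,
    model_id="microsoft/unispeech-sat-base-100h-libri-ft",  # assumed Hub id
):
    """Mean-pooled last hidden state for one mono 16 kHz waveform."""
    # Imported lazily so the pure-Python helpers work without these packages.
    import torch
    from transformers import UniSpeechSatModel, Wav2Vec2FeatureExtractor

    extractor = Wav2Vec2FeatureExtractor.from_pretrained(model_id)
    model = UniSpeechSatModel.from_pretrained(model_id)
    model.eval()

    inputs = extractor(waveform, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        hidden = model(inputs.input_values).last_hidden_state  # (1, frames, hidden)
    return mean_pool(hidden[0].tolist())
```

Two recordings could then be compared with `cosine_similarity(utterance_embedding(a), utterance_embedding(b))`. Because this checkpoint was fine-tuned for transcription, a model trained specifically for speaker verification would likely separate speakers more reliably.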