Wav2vec Osr

Developed by iamtarun
A fine-tuned Facebook wav2vec2 model for the speech-to-text module of The Sound of AI Open Source Research Group
Downloads 22
Release Time: 3/2/2022

Model Overview

A wav2vec2-based speech recognition model for speech-to-text conversion. The original model was pre-trained and fine-tuned on 960 hours of LibriSpeech audio, and it expects speech input sampled at 16 kHz.
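
Because the model expects 16 kHz input, audio recorded at another rate has to be resampled before transcription. The sketch below shows one possible way to do this and then run CTC decoding with the Hugging Face `transformers` classes for wav2vec2. It is an illustration under assumptions, not the author's pipeline: `resample_linear` and `transcribe` are hypothetical helper names, the linear interpolation has no anti-aliasing filter (a dedicated resampler is preferable for real audio), and `model_id` must be supplied by the caller, on the assumption that the checkpoint is published with both model weights and a processor configuration.

```python
def resample_linear(samples, src_rate, dst_rate=16_000):
    """Resample mono audio by linear interpolation (crude: no anti-aliasing)."""
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)
    n_out = int(round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate  # source samples advanced per output sample
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * step
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


def transcribe(samples_16k, model_id):
    """Greedy CTC transcription of 16 kHz mono audio.

    model_id is an assumption: pass the Hub id or local path of the checkpoint.
    """
    # Imported lazily so the resampling helper works without these packages.
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    inputs = processor(samples_16k, sampling_rate=16_000, return_tensors="pt")
    with torch.no_grad():
        logits = model(inputs.input_values).logits
    predicted_ids = torch.argmax(logits, dim=-1)  # most likely token per frame
    return processor.batch_decode(predicted_ids)[0]
```

A typical call would resample first, for example `transcribe(resample_linear(samples, 44_100), model_id)`, where `samples` is a list of floats in the range [-1, 1].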

Model Features

Efficient Speech Recognition
Achieves high-quality speech recognition even with limited labeled data
Pre-training and Fine-tuning Integration
Pre-trained on large amounts of unlabeled speech data, then fine-tuned on labeled data
Contrastive Learning
Learns speech representations by masking parts of the latent space and solving a contrastive task
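
The contrastive task mentioned above can be sketched as follows. For a masked time step, the model's context vector should be more similar to the true latent target than to a set of distractors drawn from other time steps. This is a minimal sketch, assuming cosine similarity and a temperature-scaled softmax. The function names and the default temperature are illustrative choices, and the sketch leaves out the codebook diversity term of the published wav2vec 2.0 objective.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(context, target, distractors, temperature=0.1):
    """Negative log-probability of picking the true target among all candidates."""
    candidates = [target] + list(distractors)
    logits = [cosine(context, q) / temperature for q in candidates]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_denominator - logits[0]
```

The loss is close to zero when the context vector points at the true target and grows when it points at a distractor instead, which is the signal that shapes the learned representations.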

Model Capabilities

Speech-to-Text
English Speech Recognition

Use Cases

Speech Transcription
Meeting Minutes
Automatically convert meeting recordings into text transcripts
Voice Notes
Convert voice notes into searchable text
Assistive Technology
Hearing Assistance
Provide real-time speech-to-text services for the hearing impaired