Wav2vec2 Large Lv60 Timit

Developed by harshit345
A speech recognition model based on facebook/wav2vec2-large-lv60 and fine-tuned on the TIMIT dataset. It expects speech input sampled at 16 kHz.
Downloads 21
Release Time: 3/2/2022

Model Overview

This model is an English Automatic Speech Recognition (ASR) system, fine-tuned on the TIMIT dataset, that converts English speech into text.

Model Features

High Accuracy Speech Recognition
Achieves a 13.5% Word Error Rate (WER) on the TIMIT test set
No Language Model Required
Can be used directly without additional language model support
16kHz Sampling Rate Support
Optimized for 16kHz sampled speech input
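
The WER figure quoted above is the number of word-level substitutions, deletions, and insertions needed to turn the model's transcript into the reference transcript, divided by the number of reference words. The sketch below shows one common way to compute it with a word-level edit distance. The function name `word_error_rate` and the lowercase, whitespace-split normalization are assumptions made for illustration; the page does not say which scoring script or text normalization produced the 13.5% result.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative only: normalization (lowercasing, whitespace split) is an
    assumption, not the scoring procedure behind the reported 13.5% WER.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, WER can exceed 100%, so "100% minus WER" is only a rough reading of accuracy.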

Model Capabilities

English Speech Recognition
Real-time Speech-to-Text
Audio Transcription
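
A minimal transcription sketch, assuming the checkpoint is published on the Hugging Face Hub with a CTC head and processor files, as is typical for fine-tuned wav2vec2 models. The page does not state the Hub identifier, so `model_id` is left as a parameter. The helper names `load_wav_16k` and `transcribe` are hypothetical, and the loader assumes mono 16-bit PCM WAV input. Decoding is a greedy argmax over the CTC output, which is one way to use the model without a language model.

```python
import array
import sys
import wave

EXPECTED_RATE = 16_000  # the model expects 16 kHz input


def load_wav_16k(path: str) -> list:
    """Read a mono 16-bit PCM WAV file and return samples scaled to [-1, 1).

    Raises ValueError instead of resampling, so a mismatched file is caught
    before it reaches the model.
    """
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE:
            raise ValueError(
                f"expected {EXPECTED_RATE} Hz audio, got {wav.getframerate()} Hz"
            )
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected mono 16-bit PCM audio")
        frames = wav.readframes(wav.getnframes())
    pcm = array.array("h")
    pcm.frombytes(frames)
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian
    return [sample / 32768.0 for sample in pcm]


def transcribe(path: str, model_id: str) -> str:
    """Greedy CTC transcription; model_id is whatever Hub id hosts the checkpoint."""
    # Imported lazily so the loader above works without these packages installed.
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    model.eval()

    samples = load_wav_16k(path)
    inputs = processor(samples, sampling_rate=EXPECTED_RATE, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    predicted_ids = torch.argmax(logits, dim=-1)  # best token per frame
    return processor.batch_decode(predicted_ids)[0]
```

Audio recorded at another rate would need to be resampled to 16 kHz first; the loader rejects it rather than guessing at a resampling method.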

Use Cases

Speech Transcription
Automatic Meeting Minutes Transcription
Automatically convert meeting recordings into text transcripts
Approximately 86.5% word accuracy, as implied by the 13.5% WER on the TIMIT test set; accuracy on meeting recordings may differ
Voice Command Recognition
Recognize and process voice commands
Education
Pronunciation Evaluation
Assist language learners in evaluating pronunciation accuracy