Wav2vec2-base-timit-demo-colab1 Open-source Speech Recognition Model

Wav2vec2 Base Timit Demo Colab1

Developed by cuzeverynameistaken

A speech recognition model fine-tuned from facebook/wav2vec2-base, then trained and evaluated on the TIMIT dataset.

Downloads 16

Release Time: 5/1/2022

Model Overview

A speech recognition model based on the wav2vec2 architecture, suitable for English speech-to-text tasks.

Based on wav2vec2 Architecture

Uses Facebook's wav2vec2-base as the base model and relies on its pretrained speech feature extraction.

Fine-tuned Optimization

Fine-tuned on the TIMIT dataset for English speech recognition.

Moderate Performance

Achieves a word error rate (WER) of 0.4784 (47.84%) on the evaluation set at the final logged checkpoint (step 2000).
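For readers unfamiliar with the metric, WER is the word-level edit distance (substitutions, deletions, and insertions) between the model output and the reference transcript, divided by the number of reference words. The following is a minimal sketch of that definition. The function name word_error_rate and the example sentences are illustrative, and this is not necessarily the implementation that produced the figure above.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
print(round(word_error_rate("the cat sat on the mat", "the cat sit on mat"), 4))
```

Under this definition, a WER of 0.4784 corresponds to roughly 48 word errors per 100 reference words. A WER of 1.0 can occur, for example, when the hypothesis is empty and every reference word counts as a deletion.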

English Speech Recognition

Speech-to-Text

Speech Transcription

English Speech Transcription

Convert English speech content into text

Word error rate: 0.4784
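A sketch of how the transcription use case might look with the Hugging Face transformers library. Several things here are assumptions rather than facts from this page: the Hub identifier MODEL_ID is assembled from the developer and model names above and should be verified, the checkpoint is assumed to carry a CTC head loadable with Wav2Vec2ForCTC, the repository is assumed to include processor files, and the input is assumed to be a 16 kHz WAV file read with the soundfile package.

```python
# Hypothetical Hub ID built from the developer and model names; verify before use.
MODEL_ID = "cuzeverynameistaken/wav2vec2-base-timit-demo-colab1"
SAMPLE_RATE = 16_000  # wav2vec2-base was pretrained on 16 kHz audio


def transcribe(wav_path: str, model_id: str = MODEL_ID) -> str:
    """Greedy CTC transcription of a single audio file (assumes a CTC head)."""
    # Heavy dependencies are imported here so the module loads without them.
    import soundfile as sf
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    speech, rate = sf.read(wav_path, dtype="float32")
    if rate != SAMPLE_RATE:
        raise ValueError(f"expected {SAMPLE_RATE} Hz audio, got {rate} Hz")
    if speech.ndim > 1:
        speech = speech.mean(axis=1)  # downmix multi-channel audio to mono

    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    model.eval()

    inputs = processor(speech, sampling_rate=SAMPLE_RATE, return_tensors="pt")
    with torch.no_grad():
        logits = model(inputs.input_values).logits

    # Pick the most likely token per frame; batch_decode collapses repeats and blanks.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

Calling transcribe("sample.wav") with a hypothetical file name would return the decoded text. Given the reported WER, expect a substantial share of words to be wrong.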

Training Loss	Epoch	Step	Validation Loss	WER
5.1915	13.89	500	3.1318	1.0
1.4993	27.78	1000	0.6736	0.5485
0.3416	41.67	1500	0.7111	0.5092
0.1937	55.56	2000	0.7170	0.4784
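The log above can be read more closely with a few lines of arithmetic. The sketch below copies the four rows verbatim and derives the approximate number of optimizer steps per epoch, then finds the checkpoints with the lowest WER and the lowest validation loss. The names LogRow and LOG are illustrative.

```python
from collections import namedtuple

LogRow = namedtuple("LogRow", "train_loss epoch step val_loss wer")

# Rows copied from the training log above.
LOG = [
    LogRow(5.1915, 13.89, 500, 3.1318, 1.0),
    LogRow(1.4993, 27.78, 1000, 0.6736, 0.5485),
    LogRow(0.3416, 41.67, 1500, 0.7111, 0.5092),
    LogRow(0.1937, 55.56, 2000, 0.7170, 0.4784),
]

# Step / epoch gives the number of optimizer steps in one pass over the data.
steps_per_epoch = [round(row.step / row.epoch) for row in LOG]

best_wer = min(LOG, key=lambda row: row.wer)
best_val_loss = min(LOG, key=lambda row: row.val_loss)

print("steps per epoch:", steps_per_epoch)
print("lowest WER at step", best_wer.step, "->", best_wer.wer)
print("lowest validation loss at step", best_val_loss.step, "->", best_val_loss.val_loss)
```

The rows imply about 36 steps per epoch. Validation loss is lowest at step 1000 and rises slightly afterwards, while WER keeps improving through step 2000, so the two metrics would select different checkpoints.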
