Open-source Speech Recognition Model wav2vec2-timit-demo - Free Deployment for Precise Speech Content Recognition

Wav2vec2 Timit Demo

Developed by asini

A speech recognition model based on facebook/wav2vec2-base and fine-tuned on the TIMIT dataset

Speech Recognition

Transformers

Open Source License: Apache-2.0 #Speech Recognition #Low Word Error Rate #TIMIT Dataset

Downloads 21

Release Time: 3/2/2022

Model Overview

This is an English speech recognition model obtained by fine-tuning facebook/wav2vec2-base on the TIMIT dataset. It reaches a word error rate (WER) of 34.62% on TIMIT.

Model Features

Efficient Fine-tuning

Fine-tuned from the pre-trained facebook/wav2vec2-base checkpoint, so it reuses the speech representations learned during pre-training

Low Word Error Rate

Achieved a word error rate (WER) of 34.62% on the TIMIT dataset after 3,500 training steps (28 epochs)

Lightweight

Uses the wav2vec2-base architecture, which is less computationally demanding than the larger wav2vec2 variants

Model Capabilities

English Speech Recognition

Audio to Text

Speech Content Analysis
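
The audio-to-text capability above could be exercised roughly as follows. This is a minimal sketch, not the author's documented usage: it assumes the model is published on the Hugging Face Hub under the ID `asini/wav2vec2-timit-demo` (inferred from the developer and model names on this page), that the `transformers` library and a PyTorch backend are installed, and that `ffmpeg` is available for decoding audio files. The helper name `transcribe` is hypothetical.

```python
# Assumed Hub ID, inferred from the developer name and model name on this page.
MODEL_ID = "asini/wav2vec2-timit-demo"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Return the transcript of one English audio file."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    # The ASR pipeline loads the fine-tuned model together with its processor.
    # Passing a file path relies on ffmpeg to decode and resample the audio.
    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)
    return result["text"]
```

Calling `transcribe("recording.wav")` (a placeholder file name) would download the checkpoint on first use and return the recognized text as a string.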

Use Cases

Speech Transcription

Meeting Minutes

Automatically convert English meeting recordings into text transcripts

Word accuracy approximately 65.38% (computed as 1 − WER from the TIMIT evaluation result)

Voice Notes

Convert English voice notes into searchable text

Speech Analysis

Speech Content Analysis

Analyze keywords and topics in speech content
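
The accuracy figure quoted under Meeting Minutes is derived from the word error rate, so it helps to see how WER itself is defined. The sketch below is an illustrative implementation of the standard definition (word-level edit distance divided by the number of reference words); it is not the evaluation script used for this model, and the function names are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def word_accuracy(wer: float) -> float:
    """The '1 - WER' figure used on this page."""
    return 1.0 - wer


# Reported WER of 0.3462 gives the 65.38% figure quoted above.
print(f"{word_accuracy(0.3462):.2%}")
```

Because insertions count as errors, WER can exceed 1 when the hypothesis contains many extra words, which is why an early checkpoint can report a value above 100%.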

Training Results

Training Loss	Epoch	Step	Validation Loss	WER
3.487	4.0	500	1.3466	1.0153
0.6134	8.0	1000	0.4807	0.4538
0.2214	12.0	1500	0.4684	0.3984
0.1233	16.0	2000	0.5070	0.3779
0.0847	20.0	2500	0.4965	0.3705
0.0611	24.0	3000	0.4881	0.3535
0.0464	28.0	3500	0.4847	0.3462
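
The rows above can be read programmatically to compare checkpoints. The following sketch copies the values from the table and picks the best row by WER and by validation loss; the type and variable names are illustrative only.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    training_loss: float
    epoch: float
    step: int
    validation_loss: float
    wer: float


# Values copied from the training results table.
RAW_ROWS = """\
3.487 4.0 500 1.3466 1.0153
0.6134 8.0 1000 0.4807 0.4538
0.2214 12.0 1500 0.4684 0.3984
0.1233 16.0 2000 0.5070 0.3779
0.0847 20.0 2500 0.4965 0.3705
0.0611 24.0 3000 0.4881 0.3535
0.0464 28.0 3500 0.4847 0.3462
"""


def parse_rows(text: str) -> list[EvalRow]:
    """Parse whitespace-separated table rows into typed records."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss, wer = line.split()
        rows.append(
            EvalRow(float(train_loss), float(epoch), int(step),
                    float(val_loss), float(wer))
        )
    return rows


ROWS = parse_rows(RAW_ROWS)
best_by_wer = min(ROWS, key=lambda r: r.wer)
best_by_loss = min(ROWS, key=lambda r: r.validation_loss)

print("Lowest WER:", best_by_wer.wer, "at step", best_by_wer.step)
print("Lowest validation loss:", best_by_loss.validation_loss,
      "at step", best_by_loss.step)
```

In this table the lowest validation loss occurs at step 1500, while WER keeps improving through step 3500, so the two criteria would select different checkpoints.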

Property	Details
Model Type	wav2vec2-timit-demo
License	Apache-2.0
