Open-Source wav2vec2-random Automatic Speech Recognition Model


Wav2vec2 Random

Developed by patrickvonplaten

An automatic speech recognition model obtained by fine-tuning patrickvonplaten/wav2vec2-base-random on the TIMIT_ASR dataset

Speech Recognition

Transformers

#English speech recognition #TIMIT dataset fine-tuning #Low-resource optimization

Downloads 16

Release Time: 3/2/2022

Model Overview

This model applies the wav2vec2 architecture to English speech recognition. It was fine-tuned on the TIMIT_ASR dataset and converts speech to text.

Model Features

Based on wav2vec2 architecture

Uses the wav2vec2 self-supervised learning architecture proposed by Facebook AI Research

Fine-tuned on TIMIT_ASR dataset

Fine-tuned on the standard TIMIT speech recognition dataset

Medium-sized model

Based on the wav2vec2-base architecture, suitable for environments with moderate computational resources

Model Capabilities

English speech recognition

Speech-to-text conversion
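
As a rough illustration of the speech-to-text capability, the sketch below transcribes one audio file with the Hugging Face transformers ASR pipeline. The Hub id patrickvonplaten/wav2vec2-random is an assumption inferred from the model name and developer shown on this page, and the helper name transcribe is hypothetical. The sketch also assumes the repository ships a processor alongside the weights.

```python
# Assumed Hub id: inferred from the page title and developer, not stated explicitly.
MODEL_ID = "patrickvonplaten/wav2vec2-random"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Return the transcript the model produces for one audio file (hypothetical helper)."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # Passing a file path lets the pipeline decode the audio; this needs ffmpeg.
    result = asr(audio_path)
    return result["text"]


# Example call (hypothetical file name):
#     print(transcribe("sample.wav"))
```

Given the word error rate reported on this page, transcripts from such a call should be expected to contain many errors.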

Use Cases

Speech transcription

Speech recording transcription

Convert English speech recordings into text transcripts

Achieves a word error rate of 0.8364 on the TIMIT evaluation set

Voice interface

Voice command recognition

Recognize simple English voice commands

Model Details

Property	Details
Model Type	Fine-tuned version of patrickvonplaten/wav2vec2-base-random on the TIMIT_ASR - NA dataset
Evaluation Loss	3.1593
Evaluation WER	0.8364
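
For readers unfamiliar with the metric, word error rate (WER) is conventionally the word-level edit distance between the reference and the hypothesis, divided by the number of reference words. The following is a minimal sketch of that standard definition. The function name is illustrative, and this is not necessarily the scoring code that produced the 0.8364 figure above.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, WER is not capped at 1.0. A value of 0.8364 means the number of word-level edits needed is about 84% of the number of reference words.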

Training Results

Training Loss	Epoch	Step	Validation Loss	WER
2.9043	0.69	100	2.9683	1.0
2.8537	1.38	200	2.9281	0.9997
2.7803	2.07	300	2.7330	0.9999
2.6806	2.76	400	2.5792	1.0
2.4136	3.45	500	2.4327	0.9948
2.1682	4.14	600	2.3508	0.9877
2.2577	4.83	700	2.2176	0.9773
2.355	5.52	800	2.1753	0.9542
1.8588	6.21	900	2.0650	0.8851
1.6831	6.9	1000	2.0109	0.8618
1.888	7.59	1100	1.9660	0.8418
2.0066	8.28	1200	1.9847	0.8531
1.7044	8.97	1300	1.9760	0.8527
1.3168	9.66	1400	2.0708	0.8327
1.2143	10.34	1500	2.0601	0.8419
1.6189	11.03	1600	2.0960	0.8299
1.13	11.72	1700	2.2540	0.8408
0.8001	12.41	1800	2.4260	0.8306
0.7769	13.1	1900	2.4182	0.8445
1.2165	13.79	2000	2.3666	0.8284
0.8026	14.48	2100	2.7118	0.8662
0.5148	15.17	2200	2.7957	0.8526
0.4921	15.86	2300	2.8244	0.8346
0.7629	16.55	2400	2.8944	0.8370
0.5762	17.24	2500	3.0335	0.8367
0.4076	17.93	2600	3.0776	0.8358
0.3395	18.62	2700	3.1572	0.8261
0.4862	19.31	2800	3.1319	0.8414
0.5061	20.0	2900	3.1593	0.8364
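
The table can be summarized with a few lines of code. The sketch below copies the step, validation loss, and WER columns from the table above and picks out the checkpoints with the lowest validation loss and the lowest WER. It also derives the number of steps per epoch from the final row. The variable names are illustrative.

```python
# (step, validation_loss, wer) copied from the training results table.
ROWS = [
    (100, 2.9683, 1.0), (200, 2.9281, 0.9997), (300, 2.7330, 0.9999),
    (400, 2.5792, 1.0), (500, 2.4327, 0.9948), (600, 2.3508, 0.9877),
    (700, 2.2176, 0.9773), (800, 2.1753, 0.9542), (900, 2.0650, 0.8851),
    (1000, 2.0109, 0.8618), (1100, 1.9660, 0.8418), (1200, 1.9847, 0.8531),
    (1300, 1.9760, 0.8527), (1400, 2.0708, 0.8327), (1500, 2.0601, 0.8419),
    (1600, 2.0960, 0.8299), (1700, 2.2540, 0.8408), (1800, 2.4260, 0.8306),
    (1900, 2.4182, 0.8445), (2000, 2.3666, 0.8284), (2100, 2.7118, 0.8662),
    (2200, 2.7957, 0.8526), (2300, 2.8244, 0.8346), (2400, 2.8944, 0.8370),
    (2500, 3.0335, 0.8367), (2600, 3.0776, 0.8358), (2700, 3.1572, 0.8261),
    (2800, 3.1319, 0.8414), (2900, 3.1593, 0.8364),
]
FINAL_EPOCH = 20.0  # epoch value in the last table row

best_by_loss = min(ROWS, key=lambda row: row[1])
best_by_wer = min(ROWS, key=lambda row: row[2])
final_step = ROWS[-1][0]
steps_per_epoch = final_step / FINAL_EPOCH

print("lowest validation loss:", best_by_loss)
print("lowest WER:", best_by_wer)
print("steps per epoch:", steps_per_epoch)
```

In these numbers, validation loss is lowest at step 1100 (1.9660) and rises afterwards while training loss keeps falling, which is a pattern often associated with overfitting. WER, by contrast, stays in a narrow band of roughly 0.83 to 0.87 from step 1100 onward, and the reported final values (loss 3.1593, WER 0.8364) come from the last row rather than the best one.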
