wav2vec2-base_toy_train_data_augment_0.1: Open-source Speech Recognition Model

Wav2vec2 Base Toy Train Data Augment 0.1

Developed by scasutt

A speech recognition model fine-tuned from facebook/wav2vec2-base, trained on a toy dataset with data augmentation applied at a ratio of 0.1.

Speech Recognition

Transformers

Open Source License: Apache-2.0

Tags: #Speech recognition fine-tuning #Low-resource data augmentation #High word error rate

Downloads 22

Release Time: 3/25/2022

Model Overview

This model is a fine-tuned version of facebook/wav2vec2-base intended for speech recognition. Its reported performance is poor: the word error rate (WER) on the evaluation set is 0.9954.

Model Features

Data augmentation training

Data augmentation was applied at a ratio of 0.1 during training

Based on wav2vec2 architecture

Uses facebook's wav2vec2-base as the base model

Model Capabilities

Speech recognition

Audio feature extraction

Use Cases

Speech processing

Speech-to-text

Convert speech content to text

Currently has high word error rate (WER=0.9954)
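
For context on that figure, WER is the word-level edit distance (substitutions, deletions, and insertions) between the reference transcript and the model output, divided by the number of reference words. The following is a minimal sketch of the metric in plain Python, not the evaluation code used for this model; the function name `word_error_rate` and the example sentences are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences (not taken from the model's evaluation data).
print(word_error_rate("the cat sat", "the cat sat"))  # perfect match
print(word_error_rate("the cat sat", "the dog sat"))  # one substitution
print(word_error_rate("the cat sat", ""))             # empty output
```

Read this way, a WER of 0.9954 means the number of word errors is about 99.5% of the number of reference words.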

🚀 wav2vec2-base_toy_train_data_augment_0.1

This model is a fine-tuned version of facebook/wav2vec2-base on an unspecified dataset (the model card lists the dataset as None). On the evaluation set it reaches a loss of 3.3786 and a WER of 0.9954.

🚀 Quick Start

The model achieves the following results on the evaluation set:

Loss: 3.3786
WER: 0.9954
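
The page gives no usage snippet, so the following is a hedged sketch of how a wav2vec2 CTC checkpoint is typically loaded with the Transformers library. It rests on several assumptions that the page does not confirm: that the checkpoint is published under the Hub id `scasutt/wav2vec2-base_toy_train_data_augment_0.1`, that the repository includes a processor (feature extractor plus tokenizer), and that the input is mono audio sampled at 16 kHz, the rate facebook/wav2vec2-base was pretrained on. The helper name `transcribe` is hypothetical.

```python
# Assumed Hub repository id, built from the author and model name on this page.
MODEL_ID = "scasutt/wav2vec2-base_toy_train_data_augment_0.1"
SAMPLING_RATE = 16_000  # wav2vec2-base expects 16 kHz mono audio


def transcribe(waveform, model_id: str = MODEL_ID) -> str:
    """Greedy CTC decoding of a 1-D float waveform sampled at 16 kHz."""
    # Imported lazily so this file can be imported without loading
    # torch and transformers until a transcription is requested.
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    model.eval()

    inputs = processor(waveform, sampling_rate=SAMPLING_RATE, return_tensors="pt")
    with torch.no_grad():
        logits = model(inputs.input_values).logits

    # Pick the most likely token per frame; batch_decode collapses
    # repeated tokens and removes CTC blanks.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

Calling `transcribe(waveform)` downloads the checkpoint on first use. Given the WER reported above, the resulting transcripts should not be expected to be usable.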

📚 Documentation

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

🔧 Technical Details

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 16
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 1000
num_epochs: 20
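
Two of these values follow from the others, and a short sketch can make the relationship concrete. The total train batch size is the per-device batch size multiplied by the gradient accumulation steps, and a linear scheduler with warmup ramps the learning rate from 0 to its peak over the warmup steps, then decays it linearly to 0. The code below assumes a single training device and a total of 4760 optimizer steps; that total is an estimate inferred from the results table (about 238 steps per epoch over 20 epochs), not a value stated on this page. The function names are illustrative.

```python
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 2
WARMUP_STEPS = 1000
TOTAL_STEPS = 4760  # assumption: estimated from the results table


def effective_batch_size(per_device: int, accumulation: int, num_devices: int = 1) -> int:
    """Number of examples contributing to each optimizer update."""
    return per_device * accumulation * num_devices


def linear_schedule_lr(step: int, total_steps: int,
                       base_lr: float = LEARNING_RATE,
                       warmup: int = WARMUP_STEPS) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup)
    # Decay phase: scale linearly from base_lr down to 0 at total_steps.
    remaining = (total_steps - step) / max(1, total_steps - warmup)
    return base_lr * max(0.0, remaining)


print(effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
for step in (0, 500, 1000, 2880, TOTAL_STEPS):
    print(step, linear_schedule_lr(step, TOTAL_STEPS))
```

Under these assumptions, roughly the first four epochs are spent in warmup, and the peak learning rate of 0.0001 is reached at step 1000.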

Training results

Training Loss	Epoch	Step	Validation Loss	WER
3.1342	1.05	250	3.3901	0.9954
3.0878	2.1	500	3.4886	0.9954
3.0755	3.15	750	3.4616	0.9954
3.0891	4.2	1000	3.5316	0.9954
3.0724	5.25	1250	3.2608	0.9954
3.0443	6.3	1500	3.3881	0.9954
3.0421	7.35	1750	3.4507	0.9954
3.0448	8.4	2000	3.4525	0.9954
3.0455	9.45	2250	3.3342	0.9954
3.0425	10.5	2500	3.3385	0.9954
3.0457	11.55	2750	3.4411	0.9954
3.0375	12.6	3000	3.4459	0.9954
3.0459	13.65	3250	3.3883	0.9954
3.0455	14.7	3500	3.3417	0.9954
3.0524	15.75	3750	3.3908	0.9954
3.0443	16.81	4000	3.3932	0.9954
3.0446	17.86	4250	3.4052	0.9954
3.0412	18.91	4500	3.3776	0.9954
3.0358	19.96	4750	3.3786	0.9954
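
The table above can be summarized programmatically. The sketch below copies the rows verbatim and reports the distinct WER values, the checkpoint with the lowest validation loss, and the overall drop in training loss. The names `LogRow`, `parse_log`, and `RAW_LOG` are illustrative, not part of any training tooling.

```python
from typing import List, NamedTuple


class LogRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    wer: float


# Rows copied from the training results table.
RAW_LOG = """\
3.1342 1.05 250 3.3901 0.9954
3.0878 2.1 500 3.4886 0.9954
3.0755 3.15 750 3.4616 0.9954
3.0891 4.2 1000 3.5316 0.9954
3.0724 5.25 1250 3.2608 0.9954
3.0443 6.3 1500 3.3881 0.9954
3.0421 7.35 1750 3.4507 0.9954
3.0448 8.4 2000 3.4525 0.9954
3.0455 9.45 2250 3.3342 0.9954
3.0425 10.5 2500 3.3385 0.9954
3.0457 11.55 2750 3.4411 0.9954
3.0375 12.6 3000 3.4459 0.9954
3.0459 13.65 3250 3.3883 0.9954
3.0455 14.7 3500 3.3417 0.9954
3.0524 15.75 3750 3.3908 0.9954
3.0443 16.81 4000 3.3932 0.9954
3.0446 17.86 4250 3.4052 0.9954
3.0412 18.91 4500 3.3776 0.9954
3.0358 19.96 4750 3.3786 0.9954
"""


def parse_log(raw: str) -> List[LogRow]:
    """Turn whitespace-separated log lines into typed rows."""
    rows = []
    for line in raw.strip().splitlines():
        train_loss, epoch, step, val_loss, wer = line.split()
        rows.append(LogRow(float(train_loss), float(epoch), int(step),
                           float(val_loss), float(wer)))
    return rows


rows = parse_log(RAW_LOG)
distinct_wer = {row.wer for row in rows}
best = min(rows, key=lambda row: row.val_loss)
train_loss_drop = rows[0].train_loss - rows[-1].train_loss

print("distinct WER values:", distinct_wer)
print("lowest validation loss:", best.val_loss, "at step", best.step)
print("training loss drop:", round(train_loss_drop, 4))
```

The WER is identical at every logged checkpoint, the validation loss fluctuates between 3.2608 and 3.5316 without a clear trend, and the lowest validation loss occurs at step 1250 rather than at the final checkpoint.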