wav2vec2-base_toy_train_data_masked_audio_10ms
This is a fine-tuned model based on facebook/wav2vec2-base. It was trained on the None dataset and achieves the following results on the evaluation set:
- Loss: 1.2477
- Wer: 0.7145
Quick Start
This model is a fine-tuned version of facebook/wav2vec2-base on the None dataset; its evaluation results are summarized under Model Information below.
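The snippet below is a minimal inference sketch, not taken from this card. It assumes the checkpoint was saved together with its `Wav2Vec2Processor`; both `model_id` and `example.wav` are placeholders to be replaced with the actual checkpoint path (or Hub id) and an audio file.

```python
# Minimal inference sketch (assumed usage; model_id and example.wav are placeholders).
import torch
import torchaudio
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

model_id = "wav2vec2-base_toy_train_data_masked_audio_10ms"  # assumed local path or Hub id

processor = Wav2Vec2Processor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)
model.eval()

# Load an audio file and resample to the 16 kHz rate expected by wav2vec2-base.
waveform, sample_rate = torchaudio.load("example.wav")
waveform = torchaudio.functional.resample(waveform, sample_rate, 16_000)

inputs = processor(waveform.squeeze(0).numpy(), sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding of the most likely token ids.
predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(predicted_ids)[0]
print(transcription)
```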
Documentation
Model Information
| Property | Details |
|----------|---------|
| Model Name | wav2vec2-base_toy_train_data_masked_audio_10ms |
| Base Model | facebook/wav2vec2-base |
| Training Dataset | None |
| Evaluation Results | Loss: 1.2477, Wer: 0.7145 |
Training Procedure
Training hyperparameters
The following hyperparameters were used during training (a hedged `TrainingArguments` sketch follows the list):
- learning_rate: 0.0001
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 1000
- num_epochs: 20
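The same settings can be expressed as a `transformers.TrainingArguments` sketch. This is an assumption-based reconstruction, not the original training script: `output_dir` is a placeholder, and the Adam settings map onto `adam_beta1`, `adam_beta2`, and `adam_epsilon`.

```python
# Hedged sketch of TrainingArguments matching the hyperparameters listed above.
# output_dir is an assumption; all other values mirror the list.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="wav2vec2-base_toy_train_data_masked_audio_10ms",  # assumed
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 8 * 2 = 16
    lr_scheduler_type="linear",
    warmup_steps=1000,
    num_train_epochs=20,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```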
Training results
| Training Loss | Epoch | Step | Validation Loss | Wer |
|---------------|-------|------|-----------------|-----|
| 3.1337 | 1.05 | 250 | 3.4081 | 0.9982 |
| 3.0792 | 2.1 | 500 | 3.2446 | 0.9982 |
| 2.0577 | 3.15 | 750 | 1.5839 | 0.9492 |
| 1.3639 | 4.2 | 1000 | 1.3279 | 0.8798 |
| 1.0814 | 5.25 | 1250 | 1.1629 | 0.8294 |
| 0.8722 | 6.3 | 1500 | 1.1305 | 0.8140 |
| 0.7602 | 7.35 | 1750 | 1.1241 | 0.7972 |
| 0.6982 | 8.4 | 2000 | 1.1429 | 0.7780 |
| 0.6494 | 9.45 | 2250 | 1.1047 | 0.7620 |
| 0.5924 | 10.5 | 2500 | 1.1756 | 0.7649 |
| 0.5385 | 11.55 | 2750 | 1.2230 | 0.7736 |
| 0.5026 | 12.6 | 3000 | 1.1783 | 0.7472 |
| 0.4973 | 13.65 | 3250 | 1.1613 | 0.7287 |
| 0.4726 | 14.7 | 3500 | 1.1923 | 0.7345 |
| 0.4521 | 15.75 | 3750 | 1.2153 | 0.7171 |
| 0.4552 | 16.81 | 4000 | 1.2485 | 0.7226 |
| 0.422 | 17.86 | 4250 | 1.2664 | 0.7240 |
| 0.3708 | 18.91 | 4500 | 1.2352 | 0.7148 |
| 0.3516 | 19.96 | 4750 | 1.2477 | 0.7145 |
Framework versions
- Transformers 4.17.0
- Pytorch 1.11.0+cu102
- Datasets 2.0.0
- Tokenizers 0.11.6
License
This model is licensed under the Apache-2.0 license.