Project_NLP: Open-Source Speech Recognition Model Fine-Tuned from wav2vec2-base

Project NLP

Developed by zakria

A speech recognition model fine-tuned from facebook/wav2vec2-base. It reaches a word error rate (WER) of 0.3355 (33.55%) on the evaluation set.

Speech Recognition

Transformers

License: Apache-2.0

Tags: #Speech Recognition #Low Word Error Rate #Wav2Vec2 Fine-tuning

Downloads 22

Release date: 6/18/2022

Model Overview

This is an automatic speech recognition (ASR) model based on the wav2vec2 architecture. It converts speech audio to text.

Model Features

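Word Error Rate

Achieves a word error rate (WER) of 0.3355 on the evaluation set, which corresponds to roughly one word-level error for every three reference words. The model card does not name the evaluation dataset.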
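To make the metric concrete, here is a minimal sketch of how WER is commonly computed: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` and the example sentences are illustrative assumptions; the card does not say which tool produced the reported figure.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences (not from the evaluation set):
# one substitution plus one deletion over six reference words.
example = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

In this made-up example the score is 2/6, about 0.33, which is close to the model's reported 0.3355.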

Based on wav2vec2 Architecture

Uses Facebook's wav2vec2-base as the pretrained base model, which learns speech representations directly from raw audio.

Linear Learning Rate Scheduling

Training uses a linear learning rate schedule with 1000 warm-up steps.

Model Capabilities

Speech Recognition

Audio-to-Text

Use Cases

Speech Transcription

Meeting Minutes

Automatically convert meeting recordings into text transcripts

Reported word error rate on the evaluation set: 0.3355

Voice Notes

Convert voice memos into searchable text

🚀 Project_NLP

Project_NLP is a fine-tuned version of [facebook/wav2vec2-base](https://huggingface.co/facebook/wav2vec2-base). The model card lists the training dataset as None, so the dataset is unspecified. On the evaluation set the model reaches a loss of 0.5324 and a WER of 0.3355, which makes it a candidate for speech recognition tasks.

🚀 Quick Start

This model is a fine-tuned version of [facebook/wav2vec2-base](https://huggingface.co/facebook/wav2vec2-base) on an unspecified dataset (listed as None in the model card). It achieves the following results on the evaluation set:

Loss: 0.5324
Wer: 0.3355
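The card itself gives no usage snippet. A minimal inference sketch with the Transformers `pipeline` API might look like the following. The Hub ID `zakria/Project_NLP` is an assumption inferred from the developer and model names on this page, and `sample.wav` is a hypothetical file. Passing a file path to the pipeline requires ffmpeg to be installed, and wav2vec2-base was pretrained on 16 kHz audio.

```python
# Assumed Hub ID, inferred from this page; not confirmed by the card.
MODEL_ID = "zakria/Project_NLP"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Return the transcript of a single audio file."""
    # Imported inside the function so this module can be loaded
    # even where transformers is not installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)  # a dict with a "text" key
    return result["text"]


# Example call (hypothetical file name):
# print(transcribe("sample.wav"))
```

Given the reported WER of 0.3355, transcripts will likely need manual review before being used as a record.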

🔧 Technical Details

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 1000
num_epochs: 30
mixed_precision_training: Native AMP
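As a rough illustration of what `lr_scheduler_type: linear` with 1000 warm-up steps implies, the sketch below computes the learning rate at a given optimizer step: it rises linearly from 0 to the peak of 0.0001 during warm-up, then decays linearly to 0. The helper name `linear_warmup_lr` is hypothetical, and `total_steps=15000` is a placeholder assumption, since the card does not state the total step count (the results table ends at step 14500).

```python
def linear_warmup_lr(
    step: int,
    base_lr: float = 1e-4,      # learning_rate from the list above
    warmup_steps: int = 1000,   # lr_scheduler_warmup_steps from the list above
    total_steps: int = 15000,   # ASSUMPTION: not stated in the card
) -> float:
    """Learning rate at `step` for linear warm-up followed by linear decay."""
    if step < warmup_steps:
        # Warm-up phase: scale from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay phase: scale from base_lr down to 0, never below 0.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


# Sample the schedule at a few steps.
schedule = {s: linear_warmup_lr(s) for s in (0, 500, 1000, 8000, 15000)}
```

Under these assumptions the rate peaks at step 1000 and is back to half the peak at step 8000.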

Training results

Training Loss	Epoch	Step	Validation Loss	Wer
3.5697	1.0	500	2.1035	0.9979
0.8932	2.01	1000	0.5649	0.5621
0.4363	3.01	1500	0.4326	0.4612
0.3035	4.02	2000	0.4120	0.4191
0.2343	5.02	2500	0.4199	0.3985
0.1921	6.02	3000	0.4380	0.4043
0.1549	7.03	3500	0.4456	0.3925
0.1385	8.03	4000	0.4264	0.3871
0.1217	9.04	4500	0.4744	0.3774
0.1041	10.04	5000	0.4498	0.3745
0.0968	11.04	5500	0.4716	0.3628
0.0893	12.05	6000	0.4680	0.3764
0.078	13.05	6500	0.5100	0.3623
0.0704	14.06	7000	0.4893	0.3552
0.0659	15.06	7500	0.4956	0.3565
0.0578	16.06	8000	0.5450	0.3595
0.0563	17.07	8500	0.4891	0.3614
0.0557	18.07	9000	0.5307	0.3548
0.0447	19.08	9500	0.4923	0.3493
0.0456	20.08	10000	0.5156	0.3479
0.0407	21.08	10500	0.4979	0.3389
0.0354	22.09	11000	0.5549	0.3462
0.0322	23.09	11500	0.5601	0.3439
0.0342	24.1	12000	0.5131	0.3451
0.0276	25.1	12500	0.5206	0.3392
0.0245	26.1	13000	0.5337	0.3373
0.0226	27.11	13500	0.5311	0.3353
0.0229	28.11	14000	0.5375	0.3373
0.0225	29.12	14500	0.5324	0.3355
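Validation loss and WER do not bottom out at the same point in this table. The short sketch below, with values copied from the rows above, picks out the step with the lowest validation loss and the step with the lowest WER. The variable names are illustrative.

```python
# (step, validation_loss, wer) copied from the training results table.
ROWS = [
    (500, 2.1035, 0.9979), (1000, 0.5649, 0.5621), (1500, 0.4326, 0.4612),
    (2000, 0.4120, 0.4191), (2500, 0.4199, 0.3985), (3000, 0.4380, 0.4043),
    (3500, 0.4456, 0.3925), (4000, 0.4264, 0.3871), (4500, 0.4744, 0.3774),
    (5000, 0.4498, 0.3745), (5500, 0.4716, 0.3628), (6000, 0.4680, 0.3764),
    (6500, 0.5100, 0.3623), (7000, 0.4893, 0.3552), (7500, 0.4956, 0.3565),
    (8000, 0.5450, 0.3595), (8500, 0.4891, 0.3614), (9000, 0.5307, 0.3548),
    (9500, 0.4923, 0.3493), (10000, 0.5156, 0.3479), (10500, 0.4979, 0.3389),
    (11000, 0.5549, 0.3462), (11500, 0.5601, 0.3439), (12000, 0.5131, 0.3451),
    (12500, 0.5206, 0.3392), (13000, 0.5337, 0.3373), (13500, 0.5311, 0.3353),
    (14000, 0.5375, 0.3373), (14500, 0.5324, 0.3355),
]

best_loss_row = min(ROWS, key=lambda row: row[1])  # lowest validation loss
best_wer_row = min(ROWS, key=lambda row: row[2])   # lowest WER

print(f"lowest validation loss: {best_loss_row[1]} at step {best_loss_row[0]}")
print(f"lowest WER: {best_wer_row[2]} at step {best_wer_row[0]}")
```

Validation loss is lowest at step 2000 (0.4120), while WER keeps improving and is lowest at step 13500 (0.3353). The headline figures reported in the card (loss 0.5324, WER 0.3355) match the final row at step 14500.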

Framework versions

Transformers 4.17.0
PyTorch 1.11.0+cu113
Datasets 1.18.3
Tokenizers 0.12.1

📄 License

This project is licensed under the Apache-2.0 license.
