Test: Open-Source Speech Recognition Model, Free to Deploy, 21.61% Word Error Rate on the Evaluation Set

Test

Developed by GleamEyeBeast

This model is a fine-tuned speech recognition model based on facebook/wav2vec2-base-960h, achieving a word error rate of 21.61% on the evaluation set.

Speech Recognition

Transformers

Open Source License: Apache-2.0

Tags: Speech Recognition, Fine-tuned Model, Low Word Error Rate

Downloads: 21

Release Date: 3/2/2022

Model Overview

This is a fine-tuned model for speech recognition tasks, based on Facebook's wav2vec2 architecture, suitable for English speech-to-text tasks.

Model Features

Efficient Fine-tuning

Fine-tuned from the pre-trained wav2vec2 model, which allows rapid adaptation with limited data

Low Word Error Rate

Reaches a word error rate of 21.61% on the evaluation set after 20 epochs (2,500 training steps)

Lightweight

Based on the wav2vec2-base architecture, so it needs fewer computational resources than larger models

Model Capabilities

English Speech Recognition

Audio-to-Text Conversion

Speech Transcription
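
The capabilities above could be exercised with the Transformers library roughly as follows. This is a minimal sketch, not the author's documented usage: the repository ID `GleamEyeBeast/test` is an assumption built from the author and model name shown on this page, and `build_transcriber` and `transcribe` are hypothetical helper names.

```python
from typing import Any, Callable

# Assumption: the Hub repository ID is not stated on this page. This value is a
# guess built from the author and model name; replace it with the real ID.
ASSUMED_MODEL_ID = "GleamEyeBeast/test"


def build_transcriber(model_id: str = ASSUMED_MODEL_ID) -> Callable[[str], Any]:
    """Create a Transformers ASR pipeline (downloads the model on first use)."""
    # Imported lazily so the rest of this module works without the heavy dependency.
    from transformers import pipeline

    return pipeline("automatic-speech-recognition", model=model_id)


def transcribe(audio_path: str, transcriber: Callable[[str], Any]) -> str:
    """Run the transcriber on one audio file and return the cleaned text."""
    result = transcriber(audio_path)  # the ASR pipeline returns a dict with a "text" key
    return result["text"].strip()
```

A typical call would be `transcribe("meeting.wav", build_transcriber())`, where `meeting.wav` is a placeholder path. Passing the transcriber in as an argument keeps `transcribe` easy to check with a stand-in function, without downloading the model. Decoding an audio file from a path relies on ffmpeg being installed.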

Use Cases

Speech Transcription

Meeting Minutes

Automatically transcribe English meeting recordings into text records

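Word-level accuracy of roughly 78.39%, estimated as 100% minus the 21.61% word error rate. This is only an approximation, because inserted words also count as errors and can push the word error rate above 100%.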

Voice Notes

Convert English voice notes into searchable text
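
To make the word error rate figure concrete, the sketch below computes it in the usual way: the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words. It is an illustrative implementation under that standard definition, not the evaluation script used for this model, and `word_error_rate` is a hypothetical helper name.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of three reference words.
print(word_error_rate("the cat sat", "the bat sat"))
# Three inserted words against a one-word reference: the rate exceeds 1.0.
print(word_error_rate("hello", "hello there big world"))
```

The second call shows why "100% minus word error rate" is only a rough accuracy estimate: insertions are counted against the reference length, so the ratio is not bounded by 1.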

Training Results

Training Loss	Epoch	Step	Validation Loss	WER
5.5828	4.0	500	3.0263	1.0
1.8657	8.0	1000	0.2213	0.2650
0.332	12.0	1500	0.2095	0.2413
0.2037	16.0	2000	0.1906	0.2222
0.1282	20.0	2500	0.1761	0.2161
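
The table can be read programmatically to pick the best checkpoint and to check a few derived figures. The sketch below copies the five rows as they appear above; the helper names are hypothetical and the arithmetic is only a reading aid, not part of the original training code.

```python
# (training_loss, epoch, step, validation_loss, wer), copied from the table above.
TRAINING_LOG = [
    (5.5828, 4.0, 500, 3.0263, 1.0),
    (1.8657, 8.0, 1000, 0.2213, 0.2650),
    (0.332, 12.0, 1500, 0.2095, 0.2413),
    (0.2037, 16.0, 2000, 0.1906, 0.2222),
    (0.1282, 20.0, 2500, 0.1761, 0.2161),
]


def best_checkpoint(log):
    """Return the logged row with the lowest word error rate."""
    return min(log, key=lambda row: row[4])


def steps_per_epoch(log):
    """Return the distinct step/epoch ratios found in the log, sorted."""
    return sorted({row[2] / row[1] for row in log})


def relative_wer_reduction(start_wer, end_wer):
    """Fraction by which the word error rate fell between two checkpoints."""
    return (start_wer - end_wer) / start_wer


best = best_checkpoint(TRAINING_LOG)
print("best step:", best[2], "WER:", best[4])
print("steps per epoch:", steps_per_epoch(TRAINING_LOG))
print("relative WER reduction, step 1000 to 2500:",
      round(relative_wer_reduction(TRAINING_LOG[1][4], TRAINING_LOG[-1][4]), 4))
```

Every row gives the same ratio of 125 steps per epoch, the lowest word error rate is the final one (0.2161 at step 2500), and the drop from 0.2650 at step 1000 to 0.2161 is a relative reduction of about 18.5%.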
