wav2vec2-base_toy_train_data_slow_10pct Open-source Speech Recognition Model - Accurately Identify Speech Content with Low Error Rate

Wav2vec2 Base Toy Train Data Slow 10pct

Developed by scasutt

A speech recognition model fine-tuned on an unknown dataset based on facebook/wav2vec2-base, with a Word Error Rate (WER) of 0.7175

Speech Recognition

Transformers

Open Source License:Apache-2.0 #Speech Recognition #Low-resource Fine-tuning #Wav2Vec2 Architecture

Downloads 22

Release Time : 3/27/2022

Model Overview

This model is a fine-tuned version of wav2vec2-base, primarily used for speech recognition tasks. The model demonstrates certain recognition capabilities on the evaluation set but still has room for improvement.

Model Features

Fine-tuned based on wav2vec2-base

Fine-tuned on the base wav2vec2 model to adapt to specific speech recognition tasks

Linear Learning Rate Scheduling

Adopts a linear learning rate scheduling strategy with a 1000-step warm-up period

Gradient Accumulation Training

Uses gradient accumulation (steps=2) to increase effective batch size

Model Capabilities

Speech-to-Text

Automatic Speech Recognition

Use Cases

Speech Transcription

Meeting Minutes Transcription

Convert meeting recordings into text transcripts

Word Error Rate 0.7175

Voice Command Recognition

Recognize simple voice commands

Training Loss	Epoch	Step	Validation Loss	Wer
3.0663	2.1	500	3.0725	0.9982
1.1679	4.2	1000	1.3620	0.8889
0.6789	6.3	1500	1.2182	0.8160
0.5764	8.4	2000	1.2469	0.7667
0.4603	10.5	2500	1.2851	0.7533
0.4085	12.6	3000	1.2351	0.7401
0.3583	14.7	3500	1.2455	0.7367
0.3158	16.81	4000	1.3663	0.7261
0.2817	18.91	4500	1.3248	0.7175

Featured Recommended AI Models

Empowering the Future, Your AI Solution Knowledge Base

English 简体中文繁體中文にほんご

Wav2vec2 Base Toy Train Data Slow 10pct

Model Overview

Model Features

Model Capabilities

Use Cases

🚀 wav2vec2-base_toy_train_data_slow_10pct

🚀 Quick Start

📚 Documentation

Model description

Intended uses & limitations

Training and evaluation data

Training procedure

Training hyperparameters

Training results

Framework versions

📄 License