alphaDelay: Open-Source Speech Recognition Model Fine-Tuned from wav2vec2

Developed by renBaikau

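A speech recognition model fine-tuned from facebook/wav2vec2-base. Its reported word error rate (WER) on the evaluation set is 1.0; if that value is a fraction, as is usual for this metric, it corresponds to as many word errors as there are reference words (100%).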

Speech Recognition

Transformers

Open Source License: Apache-2.0

Tags: #Speech Recognition Fine-tuning #Wav2Vec2 Optimization #Low Word Error Rate

Downloads 17

Release Time: 3/2/2022

Model Overview

This model is a fine-tuned speech recognition (ASR) model based on the facebook/wav2vec2-base architecture, suitable for tasks converting speech to text.

Model Features

Based on wav2vec2 architecture

Uses the wav2vec2-base architecture, which learns speech representations directly from raw audio waveforms

Fine-tuning optimization

Fine-tuned from the base model for 15 epochs to adapt it to a specific target scenario
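The Epoch and Step columns of the training results also show how many training steps each epoch took. The following is a minimal sketch of that arithmetic; the `checkpoints` list and the `steps_per_epoch` helper are illustrative names, and the values are copied from the results table on this page.

```python
# (epoch, step) pairs copied from the training results table on this page.
checkpoints = [(5.0, 25), (10.0, 50), (15.0, 75)]


def steps_per_epoch(rows):
    """Return training steps per epoch, checking that the rate is constant."""
    rates = {step / epoch for epoch, step in rows}
    if len(rates) != 1:
        raise ValueError(f"inconsistent step/epoch ratios: {sorted(rates)}")
    return rates.pop()


print(steps_per_epoch(checkpoints))  # 5.0
```

Five steps per epoch, 75 steps in total, is a very short fine-tuning run, which is worth keeping in mind when reading the reported metrics.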

Model Capabilities

Speech-to-text

Automatic Speech Recognition

Use Cases

Speech Transcription

Meeting Minutes

Automatically convert meeting recordings into text transcripts

Voice Notes

Convert voice memos into searchable text
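The transcription use cases above could be sketched with the Hugging Face Transformers `pipeline` API. This is a hedged example, not the author's documented usage: the model ID `renBaikau/alphaDelay` is an assumption inferred from the developer and model names on this page, `transcribe` is a hypothetical helper, and reading an audio file by path requires ffmpeg to be installed.

```python
def transcribe(audio_path: str, model_id: str = "renBaikau/alphaDelay") -> str:
    """Transcribe one audio file with an ASR pipeline.

    model_id is an ASSUMED Hub identifier; replace it with the real one.
    """
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # chunk_length_s splits long recordings (e.g. meetings) into 30-second
    # windows, so the whole file is not processed in one forward pass.
    result = asr(audio_path, chunk_length_s=30)
    return result["text"]


if __name__ == "__main__":
    import sys

    # Usage: python transcribe.py meeting.wav
    if len(sys.argv) > 1:
        print(transcribe(sys.argv[1]))
```

Given the reported WER of 1.0, it is worth checking the output against a few manually transcribed clips before relying on it for meeting minutes or voice notes.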

Training Results

Training Loss	Epoch	Step	Validation Loss	WER
82.3335	5.0	25	14.0648	1.0
6.1049	10.0	50	3.7145	1.0
3.9873	15.0	75	3.6648	1.0
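To make the WER column concrete, here is a minimal sketch of how word error rate is commonly computed: the word-level edit distance between reference and hypothesis, divided by the number of reference words. The `word_error_rate` function is illustrative and is not necessarily the exact metric implementation used to produce the table.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


print(word_error_rate("the cat sat", "the cat sat"))  # 0.0
print(word_error_rate("the cat sat", ""))             # 1.0
```

Under this definition an empty transcript scores exactly 1.0, so a constant WER of 1.0 across all three checkpoints is consistent with a model that has not yet learned to emit correct words.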
