Project NLP

Developed by zakria
A speech recognition model fine-tuned from facebook/wav2vec2-base, achieving a word error rate (WER) of 0.3355 on the evaluation set.
Downloads 22
Release Time: 6/18/2022

Model Overview

This is a speech recognition model based on the wav2vec2 architecture, suited to speech-to-text tasks.

Model Features

Low Word Error Rate
Achieves a word error rate (WER) of 0.3355 on the evaluation set, which corresponds to roughly one word-level error for every three reference words.
Based on wav2vec2 Architecture
Uses Facebook's wav2vec2-base model as its foundation, which provides strong speech feature extraction.
Linear Learning Rate Scheduling
Training uses a linear learning rate schedule with a warm-up phase.
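
The learning rate schedule mentioned above can be sketched as follows. This is a minimal illustration of the common reading of "linear scheduling with warm-up": the rate rises linearly from zero to a peak, then decays linearly back to zero. The function name and the numeric values (peak rate, warm-up steps, total steps) are assumptions for illustration; the model card does not state the actual hyperparameters.

```python
def linear_schedule_with_warmup(step: int, warmup_steps: int,
                                total_steps: int, peak_lr: float) -> float:
    """Return the learning rate at a given optimizer step.

    Hypothetical helper: linear ramp from 0 to peak_lr over warmup_steps,
    then linear decay from peak_lr to 0 at total_steps.
    """
    if step < warmup_steps:
        # Warm-up phase: scale up in proportion to the steps taken so far.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: scale down in proportion to the steps remaining.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Illustrative values only; not taken from the model card.
PEAK_LR = 1e-4
WARMUP_STEPS = 1000
TOTAL_STEPS = 10000

for s in (0, 500, 1000, 5500, 10000):
    lr = linear_schedule_with_warmup(s, WARMUP_STEPS, TOTAL_STEPS, PEAK_LR)
    print(f"step {s:>5}: lr = {lr:.2e}")
```

The warm-up phase keeps early updates small while the optimizer statistics settle, and the decay phase reduces the step size as training approaches its end.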

Model Capabilities

Speech Recognition
Audio-to-Text
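
To illustrate how audio-to-text output is typically produced, here is a minimal sketch of greedy CTC decoding. It assumes this model uses a CTC output head, as fine-tuned wav2vec2 speech recognition checkpoints commonly do; the model card does not confirm this. The toy vocabulary, the frame sequence, and the function name are hypothetical.

```python
from itertools import groupby
from typing import Dict, Sequence


def ctc_greedy_decode(frame_ids: Sequence[int],
                      id_to_token: Dict[int, str],
                      blank_id: int = 0,
                      word_delimiter: str = "|") -> str:
    """Turn per-frame token ids into text (hypothetical helper).

    Steps: collapse consecutive repeats, drop the blank token,
    then map the word delimiter to a space.
    """
    # 1. Collapse runs of the same id (one id per run).
    collapsed = [token_id for token_id, _ in groupby(frame_ids)]
    # 2. Remove blanks; a blank between two equal ids keeps both letters.
    chars = [id_to_token[i] for i in collapsed if i != blank_id]
    # 3. Join characters and convert delimiters into spaces.
    return "".join(chars).replace(word_delimiter, " ").strip()


# Toy vocabulary for illustration; a real tokenizer's vocabulary is larger.
VOCAB = {0: "<pad>", 1: "|", 2: "h", 3: "e", 4: "l", 5: "o"}

# Per-frame argmax ids, as they might come from the model's output logits.
frames = [2, 2, 0, 3, 4, 4, 0, 4, 5, 5, 1, 1, 2, 0, 3]
print(ctc_greedy_decode(frames, VOCAB))
```

The blank token between the two runs of `l` is what allows the double letter in the decoded word to survive the repeat-collapsing step.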

Use Cases

Speech Transcription
Meeting Minutes
Automatically convert meeting recordings into text transcripts
Word error rate of 0.3355 on the model's evaluation set
Voice Notes
Convert voice memos into searchable text
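
The word error rate quoted for this model is conventionally defined as the word-level edit distance (substitutions, deletions, and insertions) between the reference transcript and the model output, divided by the number of reference words. The sketch below shows that calculation under this standard definition; the function name and the example sentences are illustrative, and the evaluation script actually used for this model is not described in the source.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences: one substitution and one deletion over six words.
reference = "the cat sat on the mat"
hypothesis = "the cat sit on mat"
print(f"WER = {word_error_rate(reference, hypothesis):.4f}")
```

When judging suitability for a use case such as meeting minutes, this kind of calculation on a small sample of your own recordings gives a more relevant figure than the evaluation-set number alone.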