W

Wav2vec2 Base 960h Finetuned Common Voice

Developed by obokkkk
This model is a speech recognition model fine-tuned on the Common Voice dataset based on facebook/wav2vec2-base-960h
Downloads 16
Release Time : 4/25/2022

Model Overview

wav2vec2-base-960h-finetuned_common_voice is an automatic speech recognition (ASR) model based on the wav2vec2 architecture, pre-trained on 960 hours of LibriSpeech dataset and fine-tuned on the Common Voice dataset.

Model Features

Based on wav2vec2 Architecture
Utilizes the advanced wav2vec2 self-supervised learning architecture to effectively learn speech representations
Common Voice Fine-tuning
Fine-tuned on the Common Voice dataset, enhancing the model's generalization capability
Efficient Training
Uses techniques like mixed-precision training and gradient accumulation to improve training efficiency

Model Capabilities

Speech Recognition
Audio to Text
English Speech Processing

Use Cases

Speech Transcription
Meeting Minutes
Automatically convert meeting recordings into text transcripts
Subtitle Generation
Automatically generate English subtitles for video content
Voice Assistant
Voice Command Recognition
Recognize and understand user voice commands
Featured Recommended AI Models
AIbase
Empowering the Future, Your AI Solution Knowledge Base
Š 2025AIbase