Xlsr Wav2vec2 2

Developed by chrisvinsen
A fine-tuned speech recognition model based on facebook/wav2vec2-large-xlsr-53, supporting multilingual speech-to-text tasks
Downloads 20
Release Time: 5/25/2022

Model Overview

This model is a fine-tuned version of facebook/wav2vec2-large-xlsr-53 for speech recognition: it converts speech to text.
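A minimal sketch of how a checkpoint like this is typically loaded through the Hugging Face transformers ASR pipeline. The Hub ID below is an assumption inferred from the author and model name on this page, and the helper names load_asr and transcribe are hypothetical; neither is confirmed by this card.

```python
# Assumed Hub ID, inferred from the author and model name; verify before use.
MODEL_ID = "chrisvinsen/xlsr-wav2vec2-2"


def load_asr(model_id=MODEL_ID):
    """Build a speech recognition pipeline (downloads weights on first call)."""
    # Imported lazily so this module can be imported without transformers.
    from transformers import pipeline

    return pipeline("automatic-speech-recognition", model=model_id)


def transcribe(audio_path, asr=None):
    """Return the transcript for one audio file.

    `asr` is any callable that maps an audio path to {"text": ...};
    when omitted, the pipeline is loaded from MODEL_ID.
    """
    if asr is None:
        asr = load_asr()
    result = asr(audio_path)
    return result["text"].strip()
```

Calling transcribe("meeting.wav") with a real file would load the model and return its text. Passing a path to the pipeline relies on ffmpeg being installed to decode the audio.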

Model Features

Multilingual support
Built on XLSR-53, which was pretrained on speech in 53 languages; the languages covered by this fine-tuned checkpoint are not stated
Efficient fine-tuning
Fine-tuned from the base model to improve performance on speech recognition
Reported word error rate
Achieved a word error rate (WER) of 0.4301 (about 43%) on the evaluation set
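To make the WER figure concrete, here is a small sketch of the standard definition: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name word_error_rate and the sample sentences are illustrative; the card does not say which tool produced its 0.4301 score.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substitution (sat -> sit) and one deletion (the) over 6 reference words.
score = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Under this definition, a WER of 0.4301 corresponds to roughly 43 word-level errors per 100 reference words.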

Model Capabilities

Speech recognition
Speech-to-text
Multilingual processing

Use Cases

Speech transcription
Meeting minutes
Automatically convert meeting recordings into text transcripts
Word error rate on the evaluation set: 0.4301
Voice notes
Convert voice memos into searchable text
Assistive technology
Real-time caption generation
Generate real-time captions for video or live streaming content