Xtreme S Xlsr 300m Voxpopuli En

Developed by anton-l
This speech recognition model is facebook/wav2vec2-xls-r-300m fine-tuned on the GOOGLE/XTREME_S - VOXPOPULI.EN dataset for English speech-to-text tasks.
Downloads 28
Release Time: 4/29/2022

Model Overview

The model targets English speech recognition: it was fine-tuned on the VOXPOPULI.EN dataset and converts English speech into text.

Model Features

Efficient Speech Recognition
Fine-tuned on the VOXPOPULI.EN dataset, optimized for English speech recognition
Based on wav2vec2-xls-r Architecture
Uses facebook's wav2vec2-xls-r-300m pre-trained model as the foundation
Multi-GPU Training Optimization
Supports distributed multi-GPU training to improve training efficiency

Model Capabilities

English Speech Recognition
Speech-to-Text
Automatic Speech Recognition
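
The capabilities listed above could be exercised with a short script such as the one below. It is a minimal sketch, not the author's published usage: it assumes the Hugging Face transformers library is installed with ffmpeg available for audio decoding, and the Hub identifier in MODEL_ID is inferred from the page title and developer name rather than stated on this page. The helper name transcribe is hypothetical.

```python
# Assumption: Hub id inferred from the title and developer name on this page.
MODEL_ID = "anton-l/xtreme_s_xlsr_300m_voxpopuli_en"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Return the English transcript of a single audio file.

    Hypothetical helper for illustration only.
    """
    # Imported lazily so the module loads even where transformers is absent.
    from transformers import pipeline

    # The ASR pipeline decodes the file with ffmpeg and resamples it to the
    # sampling rate expected by the model's feature extractor.
    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)
    return result["text"]


# Example call (downloads the model on first use):
# print(transcribe("meeting.wav"))
```

Building the pipeline inside the function keeps the sketch short. For repeated calls it would be cheaper to build the pipeline once and reuse it.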

Use Cases

Speech Transcription
Automatic Meeting Transcription
Automatically converts English meeting recordings into text transcripts
Character Error Rate (CER): 0.0966, Word Error Rate (WER): 0.1549
Podcast Content Transcription
Automatically converts English podcast content into text transcripts
Assistive Technology
Real-time Caption Generation
Generates real-time captions for English video content
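
The error rates quoted above (CER 0.0966, WER 0.1549) are edit-distance ratios: a WER of 0.1549 corresponds to roughly 15.5 word-level errors per 100 reference words. The sketch below shows how such figures are typically computed. It is an illustrative implementation, not the evaluation script used for this model, and the example sentences are made up. It also applies no text normalization, which real evaluations usually do.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, insertions and deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edit distance over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Made-up example: one substituted word out of eight.
reference = "the committee will vote on the proposal tomorrow"
hypothesis = "the committee will vote on a proposal tomorrow"
print(wer(reference, hypothesis))  # 0.125
print(cer(reference, hypothesis))  # 0.0625
```

Both metrics divide by the reference length, so they can exceed 1.0 when the hypothesis contains many insertions.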