Xlsr Timit B0

Developed by KoelLabs
A phoneme transcription model fine-tuned on the TIMIT dataset, capable of transcribing English audio into phoneme representations
Downloads 40
Release Time: 11/30/2024

Model Overview

This model starts from the pre-trained checkpoint ginic/data_seed_4_wav2vec2-large-xlsr-buckeye-ipa and is fine-tuned on the DARPA TIMIT English corpus. It transcribes English audio into phoneme representations and is reported to outperform the other XLSR models available at release on English phonetic transcription tasks.
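The transcription step described above can be sketched with the Hugging Face transformers library. This is a minimal sketch under assumptions, not the developers' documented usage: the repository ID `KoelLabs/xlsr-timit-b0`, the presence of a processor config in that repository, and the use of librosa for audio loading are all assumptions that do not appear on this page. The 16 kHz sample rate is the rate wav2vec2 XLSR checkpoints are pre-trained on.

```python
# Hypothetical repository ID (assumption, not confirmed by this page).
MODEL_ID = "KoelLabs/xlsr-timit-b0"

# wav2vec2 XLSR checkpoints expect 16 kHz mono audio.
TARGET_SAMPLE_RATE = 16_000


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Return a phoneme string for one audio file using greedy CTC decoding."""
    # Heavy dependencies are imported lazily so the module loads without them.
    import librosa
    import torch
    from transformers import AutoModelForCTC, AutoProcessor

    processor = AutoProcessor.from_pretrained(model_id)
    model = AutoModelForCTC.from_pretrained(model_id)
    model.eval()

    # Load the file as mono and resample it to the rate the model expects.
    audio, _ = librosa.load(audio_path, sr=TARGET_SAMPLE_RATE)
    inputs = processor(
        audio, sampling_rate=TARGET_SAMPLE_RATE, return_tensors="pt"
    )

    with torch.no_grad():
        logits = model(**inputs).logits

    # Greedy decoding: take the most likely token in every frame, then let
    # the processor collapse repeats and drop CTC blank tokens.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

Calling `transcribe("sample.wav")` (a hypothetical file name) would download the checkpoint on first use and return the predicted phoneme string for that recording.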

Model Features

High-precision phoneme transcription
Achieves an average character error rate (CER) of 0.113 on the TIMIT test set
English optimization
Specifically optimized for English speech with high phoneme transcription accuracy
Based on XLSR architecture
Built on the powerful wav2vec2-large-xlsr architecture with excellent speech feature extraction capabilities
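To make the CER figure above concrete, the sketch below computes character error rate as the Levenshtein edit distance between a reference and a predicted phoneme string, divided by the reference length. It is an illustration of the standard metric only: the page does not say how the reported 0.113 average was aggregated, so the per-utterance mean in `mean_cer` is an assumption, and the example strings are made up.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    # previous[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            current.append(
                min(
                    previous[j] + 1,  # delete ref_char
                    current[j - 1] + 1,  # insert hyp_char
                    previous[j - 1] + substitution_cost,  # match or substitute
                )
            )
        previous = current
    return previous[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate of one utterance, relative to the reference."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def mean_cer(pairs: list[tuple[str, str]]) -> float:
    """Average of per-utterance CERs (assumed aggregation, see lead-in)."""
    if not pairs:
        raise ValueError("pairs must not be empty")
    return sum(cer(ref, hyp) for ref, hyp in pairs) / len(pairs)


# Made-up example pairs: (reference transcription, model output).
examples = [("kæt", "kæt"), ("dɔɡ", "dɔk")]
average = mean_cer(examples)
```

Python strings are compared code point by code point here, so an IPA symbol written with a combining diacritic counts as more than one character; a stricter evaluation might first split transcriptions into phoneme tokens.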

Model Capabilities

English speech recognition
Phoneme transcription
Automatic speech transcription

Use Cases

Phonetics research
Phoneme analysis
Used for phoneme feature analysis in phonetics research
Provides accurate phoneme transcription results
Speech technology development
Speech recognition system development
Serves as a phoneme transcription component for speech recognition systems
Improves system accuracy in recognizing English phonemes