Psst Fairseq Rir
Developed by birgermoell
An automatic speech recognition (ASR) model built on the Wav2vec 2.0 architecture and fine-tuned on a TIMIT subset augmented with Room Impulse Response (RIR) data.
Downloads: 30
Release Time: 4/15/2022
Model Overview
An English phoneme recognition model trained on RIR-augmented speech so that it holds up in noisy environments.
Model Features
Noise Robustness
Trained on RIR-augmented data, which makes recognition more robust in noisy environments
Phoneme-Level Recognition
Outputs phoneme sequences rather than words or sentences
Based on Wav2vec 2.0
Builds on Wav2vec 2.0's self-supervised pretraining, which allows fine-tuning with only a small amount of labeled data
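The RIR augmentation mentioned under Noise Robustness can be illustrated with a minimal sketch. It assumes the augmentation amounts to convolving clean speech with a room impulse response; the model card does not describe the actual pipeline. The functions `synthetic_rir` and `apply_rir`, the decaying-noise impulse response, and all parameter values are hypothetical stand-ins for measured RIRs.

```python
import numpy as np


def synthetic_rir(sample_rate=16000, rt60=0.4, length_s=0.5, seed=0):
    """Toy impulse response: white noise with an exponential decay.

    Hypothetical stand-in for a measured RIR; rt60 is the time in
    seconds for the envelope to fall by 60 dB.
    """
    rng = np.random.default_rng(seed)
    n = int(sample_rate * length_s)
    t = np.arange(n) / sample_rate
    decay = 10.0 ** (-3.0 * t / rt60)  # amplitude drops 60 dB at t = rt60
    rir = rng.standard_normal(n) * decay
    rir[0] = 1.0  # direct sound path
    return rir / np.max(np.abs(rir))


def apply_rir(speech, rir):
    """Convolve speech with an RIR, keep the original length and peak level."""
    wet = np.convolve(speech, rir, mode="full")[: len(speech)]
    peak = np.max(np.abs(wet))
    if peak > 0:
        wet = wet * (np.max(np.abs(speech)) / peak)
    return wet


# Example: reverberate one second of a 440 Hz tone standing in for speech.
sr = 16000
clean = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
reverberant = apply_rir(clean, synthetic_rir(sample_rate=sr))
```

In practice the synthetic response would be replaced by recorded impulse responses from real rooms, and the reverberant audio would be paired with the original phoneme labels for fine-tuning.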
Model Capabilities
English phoneme recognition
Noisy environment speech processing
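For phoneme recognition, fine-tuned Wav2vec 2.0 models typically emit one label per audio frame through a CTC output layer, and the frame labels are then collapsed into a phoneme sequence. The model card does not state how this model decodes its output, so the sketch below is an assumption: it shows plain greedy CTC collapsing, with a hypothetical vocabulary and a hypothetical blank index of 0.

```python
def ctc_greedy_decode(frame_ids, id_to_phoneme, blank_id=0):
    """Collapse per-frame label ids into a phoneme sequence.

    Consecutive repeats are merged, then blanks are dropped. A blank
    between two identical labels keeps them as separate phonemes.
    """
    phonemes = []
    previous = None
    for label in frame_ids:
        if label != previous and label != blank_id:
            phonemes.append(id_to_phoneme[label])
        previous = label
    return phonemes


# Hypothetical vocabulary and frame-level argmax output.
vocab = {1: "k", 2: "ae", 3: "t"}
frames = [0, 1, 1, 0, 2, 2, 2, 0, 0, 3, 3]
print(ctc_greedy_decode(frames, vocab))  # ['k', 'ae', 't']
```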
Use Cases
Speech Technology Research
Phoneme Recognition Benchmarking
Can serve as a benchmark model for phoneme recognition tasks
Reported results: phoneme error rate (PER) 21.8%, feature error rate (FER) 9.6%
Educational Technology
Pronunciation Assessment
Used for evaluating pronunciation accuracy in language learning
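The PER figure quoted under Phoneme Recognition Benchmarking is conventionally the edit distance between predicted and reference phoneme sequences, divided by the number of reference phonemes. The sketch below assumes that standard definition; the exact scoring script behind the reported numbers is not given here. The function names and the toy phoneme sequences are illustrative only.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two phoneme lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs, hyps):
    """Total edit errors divided by total reference phonemes."""
    total = sum(len(r) for r in refs)
    if total == 0:
        raise ValueError("reference set contains no phonemes")
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return errors / total


# Toy example: one substitution and one deletion over six reference phonemes.
refs = [["k", "ae", "t"], ["d", "ao", "g"]]
hyps = [["k", "ah", "t"], ["d", "ao"]]
print(f"{phoneme_error_rate(refs, hyps):.1%}")  # 33.3%
```

The same comparison applies to pronunciation assessment: a learner's recognized phonemes are scored against the expected sequence for the prompt.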