Psst Fairseq Larger Rir
Developed by birgermoell
This is an automatic speech recognition (ASR) model based on the Wav2vec 2.0 architecture, fine-tuned on a subset of the TIMIT dataset augmented with room impulse responses (RIR).
Downloads 30
Release Time: 4/15/2022
Model Overview
A speech recognition model optimized for phoneme recognition tasks, suitable for speech processing in noisy environments
Model Features
RIR-augmented Training Data
Trained on TIMIT recordings augmented with room impulse responses, which is intended to make the model more robust to reverberation in real-world environments
Wav2vec 2.0 Foundation
Fine-tuned from a Wav2vec 2.0 model, so it builds on that model's learned speech representations
Phoneme-level Recognition
Focuses on phoneme-level speech recognition tasks, suitable for applications requiring detailed speech analysis
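The RIR augmentation mentioned under Model Features can be illustrated with a minimal sketch. It assumes the usual approach of convolving a clean recording with a room impulse response; the card does not say which RIRs or tooling were used, so the synthetic impulse response and the function names below (synthetic_rir, convolve, add_reverb) are hypothetical and for illustration only.

```python
import math
import random


def synthetic_rir(sample_rate=16000, rt60=0.3, seed=0):
    """Toy RIR (an assumption, not the card's data): a direct-path impulse
    followed by Gaussian noise whose amplitude decays by 60 dB over rt60 seconds."""
    rng = random.Random(seed)
    length = int(sample_rate * rt60)
    decay = math.log(1000.0) / length  # 60 dB in amplitude is a factor of 1000
    rir = [0.1 * rng.gauss(0.0, 1.0) * math.exp(-decay * n) for n in range(length)]
    rir[0] = 1.0  # direct path
    return rir


def convolve(signal, kernel):
    """Full linear convolution; output length is len(signal) + len(kernel) - 1."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue  # zero samples contribute nothing
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def add_reverb(signal, rir):
    """Convolve with the RIR, trim to the input length, and peak-normalize."""
    wet = convolve(signal, rir)[: len(signal)]
    peak = max(abs(x) for x in wet) or 1.0  # avoid dividing by zero on silence
    return [x / peak for x in wet]


# Stand-in for a clean utterance: 400 samples of a 440 Hz tone at 16 kHz.
sample_rate = 16000
dry = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(400)]
rir = synthetic_rir(sample_rate, rt60=0.02)
wet = add_reverb(dry, rir)
print(len(dry), len(wet))
```

In practice the impulse response would come from a measured or simulated room, and the convolution would be done with an FFT-based routine rather than the nested loop shown here.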
Model Capabilities
English Speech Recognition
Phoneme-level Analysis
Noisy Environment Speech Processing
Use Cases
Speech Technology Research
Phoneme Recognition Benchmark
Can serve as a benchmark model for phoneme recognition tasks in comparative studies
Reported results: phoneme error rate (PER) 21.0%, feature error rate (FER) 9.2%
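For readers comparing models against the PER figure above, the following is a minimal sketch of how a phoneme error rate is conventionally computed: the total edit distance between reference and predicted phoneme sequences, divided by the total number of reference phonemes. This is the standard definition, not necessarily the exact scoring script behind the reported numbers, and the phoneme strings are made-up examples. FER would additionally need a phoneme-to-feature mapping, which is not shown.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions,
    insertions, and deletions each cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def phoneme_error_rate(references, hypotheses):
    """Corpus-level PER: summed edit distance over summed reference length."""
    errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total = sum(len(r) for r in references)
    return errors / total


# Hypothetical ARPAbet-style transcripts, one list of phonemes per utterance.
references = ["DH AH K AE T".split(), "S IH T".split()]
hypotheses = ["DH AH K AH T".split(), "S IH".split()]

# One substitution plus one deletion over 8 reference phonemes.
print(phoneme_error_rate(references, hypotheses))  # 0.25
```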
Speech Enhancement Applications
Speech Recognition in Noisy Environments
Suitable for speech recognition in environments with echoes and noise, such as conference rooms and public spaces