Psst Fairseq Rir

Developed by birgermoell
This is an automatic speech recognition (ASR) model based on the Wav2vec 2.0 architecture, fine-tuned on a subset of TIMIT augmented with Room Impulse Responses (RIR).
Downloads 30
Release Time: 4/15/2022

Model Overview

A speech recognition model for English phoneme recognition, trained on RIR-augmented audio so that it holds up better on reverberant or noisy recordings.

Model Features

Noise Robustness
Trained on RIR-augmented data, which is intended to make recognition more robust to reverberant and noisy acoustic conditions
Phoneme-Level Recognition
Focuses on phoneme-level speech recognition tasks rather than word or sentence recognition
Based on Wav2vec 2.0
Builds on Wav2vec 2.0's self-supervised pre-training, which allows effective fine-tuning with only a small amount of labeled data
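
The RIR augmentation mentioned above amounts to convolving a clean ("dry") recording with the impulse response of a room. The following is a minimal standard-library sketch of that idea, not the author's actual training pipeline: the function names, the synthetic exponentially decaying impulse response, and the peak normalisation step are all assumptions made for illustration. Real pipelines typically use measured or simulated RIRs and an FFT-based convolution.

```python
import math
import random


def synthetic_rir(length, decay=0.005, seed=0):
    """Toy impulse response (hypothetical): a direct path followed by
    exponentially decaying random reflections."""
    rng = random.Random(seed)
    rir = [rng.uniform(-1.0, 1.0) * math.exp(-decay * n) for n in range(length)]
    rir[0] = 1.0  # direct sound arrives first at full strength
    return rir


def convolve(signal, rir):
    """Full linear convolution; output has len(signal) + len(rir) - 1 samples."""
    out = [0.0] * (len(signal) + len(rir) - 1)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue  # zero samples contribute nothing
        for j, h in enumerate(rir):
            out[i + j] += s * h
    return out


def apply_rir(signal, rir):
    """Reverberate a signal, trim it to the original length, and rescale
    so the output peak matches the input peak."""
    wet = convolve(signal, rir)[: len(signal)]
    peak_in = max(abs(x) for x in signal)
    peak_out = max(abs(x) for x in wet)
    if peak_out == 0.0:
        return wet
    return [x * peak_in / peak_out for x in wet]


# 0.1 s of a 440 Hz tone at an assumed 16 kHz sample rate.
dry = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(1600)]
wet = apply_rir(dry, synthetic_rir(400))
```

The reverberated signal keeps the length and peak level of the input, so it can stand in for the clean utterance with the same phoneme labels.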

Model Capabilities

English phoneme recognition
Speech processing in noisy or reverberant environments
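
To show how frame-level model outputs become a phoneme sequence, here is a small sketch of greedy CTC decoding. It assumes the model has a CTC output layer, which is typical for fine-tuned Wav2vec 2.0 models but is not stated on this page. The vocabulary, the blank symbol name, and the scores are made up for illustration.

```python
BLANK = "<pad>"  # assumed name of the CTC blank symbol


def ctc_greedy_decode(frame_scores, vocab, blank=BLANK):
    """Pick the best label per frame, collapse consecutive repeats,
    then drop blanks."""
    best = [vocab[max(range(len(row)), key=row.__getitem__)] for row in frame_scores]
    decoded = []
    prev = None
    for label in best:
        if label != prev and label != blank:
            decoded.append(label)
        prev = label
    return decoded


# Hypothetical vocabulary and per-frame scores (one row per audio frame).
vocab = ["<pad>", "k", "ae", "t"]
frames = [
    [0.10, 0.70, 0.10, 0.10],  # k
    [0.20, 0.60, 0.10, 0.10],  # k (repeat, collapsed)
    [0.80, 0.10, 0.05, 0.05],  # blank
    [0.10, 0.10, 0.70, 0.10],  # ae
    [0.10, 0.10, 0.60, 0.20],  # ae (repeat, collapsed)
    [0.90, 0.05, 0.03, 0.02],  # blank
    [0.10, 0.10, 0.10, 0.70],  # t
]
phonemes = ctc_greedy_decode(frames, vocab)
print(phonemes)  # ['k', 'ae', 't']
```

A blank between two identical labels keeps them separate, which is how CTC represents genuinely doubled phonemes.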

Use Cases

Speech Technology Research
Phoneme Recognition Benchmarking
Can serve as a benchmark model for phoneme recognition tasks
Reported phoneme error rate (PER): 21.8%; feature error rate (FER): 9.6%
Educational Technology
Pronunciation Assessment
Used for evaluating pronunciation accuracy in language learning
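
For readers unfamiliar with the PER figure quoted above, the sketch below shows how a phoneme error rate is commonly computed: the edit distance between predicted and reference phoneme sequences, summed over utterances and divided by the total number of reference phonemes. The same comparison is one simple way to score a learner's pronunciation against a target. The function names and the example sequences are illustrative assumptions, not the evaluation script behind the reported numbers.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions,
    insertions, and deletions each cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def phoneme_error_rate(references, hypotheses):
    """Total edits divided by total reference length, over all utterances."""
    total_len = sum(len(r) for r in references)
    if total_len == 0:
        raise ValueError("references must contain at least one phoneme")
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_len


# Made-up example: one substitution and one deletion over six reference phonemes.
refs = [["k", "ae", "t"], ["d", "ao", "g"]]
hyps = [["k", "ah", "t"], ["d", "ao"]]
per = phoneme_error_rate(refs, hyps)
print(f"PER: {per:.1%}")  # PER: 33.3%
```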