Phoneme Scorer V2 Wav2vec2

Developed by ct-vikramanantha
An automatic speech recognition model based on the Wav2Vec2-Base architecture, fine-tuned for phoneme recognition on the LJSpeech Phonemes dataset.
Downloads: 167
Release date: 7/13/2024

Model Overview

This model is an automatic speech recognition (ASR) system that converts speech into phoneme sequences rather than word sequences. Its output units are International Phonetic Alphabet (IPA) phonemes, which makes it suitable for speech processing tasks that require phoneme-level analysis.
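As a rough illustration of how frame-level predictions become a phoneme sequence, the sketch below shows greedy CTC decoding. It assumes the model uses a CTC output head, as Wav2Vec2 models fine-tuned for recognition commonly do; the page does not confirm this. The function name `ctc_greedy_decode`, the toy vocabulary, and the blank ID are hypothetical and chosen only for the example.

```python
from itertools import groupby

def ctc_greedy_decode(frame_ids, id_to_phoneme, blank_id=0):
    """Turn per-frame argmax IDs into a phoneme sequence.

    Assumption: the model emits one label per audio frame and was trained
    with CTC, so repeated labels are merged and blank labels are dropped.
    """
    # Merge consecutive duplicates: [1, 1, 0, 2, 2] -> [1, 0, 2]
    collapsed = [label for label, _ in groupby(frame_ids)]
    # Drop the blank label and map the remaining IDs to phoneme symbols
    return [id_to_phoneme[i] for i in collapsed if i != blank_id]

# Hypothetical toy vocabulary; the real model's vocabulary will differ.
toy_vocab = {0: "<blank>", 1: "h", 2: "ə", 3: "l", 4: "oʊ"}
frames = [0, 1, 1, 0, 2, 2, 2, 0, 3, 3, 4, 4, 0]
print(" ".join(ctc_greedy_decode(frames, toy_vocab)))
```

A blank between two identical labels keeps them separate, which is how CTC represents a genuinely repeated phoneme.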

Model Features

Phoneme-level recognition
The model predicts International Phonetic Alphabet (IPA) phoneme sequences directly instead of word sequences, so no separate grapheme-to-phoneme step is needed for phoneme analysis.
High accuracy
Achieves a phoneme error rate (PER) of 0.99% and a character error rate (CER) of 0.58% on the LJSpeech test set.
Based on Gruut phoneme set
Uses the International Phonetic Alphabet (IPA) phoneme set from the gruut project, supporting rich phoneme representation.
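
For context on the PER figure quoted above: phoneme error rate is commonly defined as the edit distance between predicted and reference phoneme sequences, divided by the total number of reference phonemes. The sketch below follows that common definition; the page does not state exactly how its reported figures were computed, and the function names and the example sequences are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,         # deletion
                            curr[j - 1] + 1,     # insertion
                            prev[j - 1] + cost)) # substitution or match
        prev = curr
    return prev[-1]

def phoneme_error_rate(references, hypotheses):
    """Total edit errors divided by total reference phonemes."""
    total_ref = sum(len(r) for r in references)
    if total_ref == 0:
        raise ValueError("references must contain at least one phoneme")
    errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return errors / total_ref

# Hypothetical example: one substituted phoneme out of four.
ref = ["h", "ə", "l", "oʊ"]
hyp = ["h", "ɛ", "l", "oʊ"]
print(f"PER = {phoneme_error_rate([ref], [hyp]):.2%}")
```

Applying the same function to sequences of individual characters rather than phoneme tokens gives a character error rate.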

Model Capabilities

Speech to phoneme
Automatic speech recognition
Phoneme-level analysis

Use Cases

Speech processing
Phoneme analysis research
Used in linguistic research to analyze the phonemic composition of speech
Provides precise phoneme-level transcriptions
Speech synthesis preprocessing
Provides phoneme-level input for speech synthesis systems
Improves the accuracy and naturalness of synthesized speech