Open-source model wav2vec2-large-lv60_phoneme-timit_english_timit-4k

Wav2vec2 Large Lv60 Phoneme Timit English Timit 4k

Developed by excalibur12

English phoneme recognition model fine-tuned from facebook/wav2vec2-large-lv60, achieving a phoneme error rate (PER) of 10.53% on the TIMIT test set.

Speech Recognition

Transformers

Language: English
Open Source License: Apache-2.0
Tags: #Phoneme recognition #TIMIT dataset #Low PER

Downloads 306

Release Time : 6/17/2024

Model Overview

This model is fine-tuned for English phoneme recognition and is particularly suited to phoneme-level speech analysis.
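A minimal sketch of how such a model might be run with the Transformers library. The page does not show inference code, so several things here are assumptions: the Hub repository ID (inferred from the developer and model names), the use of a CTC head (`AutoModelForCTC`), and the presence of processor files in the repository. The function name `recognize_phonemes` is hypothetical.

```python
# Assumption: repository ID inferred from the developer and model names on this page.
MODEL_ID = "excalibur12/wav2vec2-large-lv60_phoneme-timit_english_timit-4k"
SAMPLE_RATE = 16_000  # wav2vec2 checkpoints are trained on 16 kHz mono audio


def recognize_phonemes(waveform, model_id=MODEL_ID):
    """Return the decoded phoneme string for a 1-D float array of 16 kHz samples.

    Hypothetical helper: assumes the checkpoint loads as a CTC model and
    ships a processor (feature extractor plus tokenizer).
    """
    # Imported lazily so the module can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCTC, AutoProcessor

    processor = AutoProcessor.from_pretrained(model_id)
    model = AutoModelForCTC.from_pretrained(model_id)
    model.eval()

    # Normalise the raw audio and convert it to PyTorch tensors.
    inputs = processor(waveform, sampling_rate=SAMPLE_RATE, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits

    # Greedy decoding: take the most likely token per frame, then let the
    # tokenizer collapse repeated tokens and drop CTC blanks.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

The exact format of the returned string (spacing, phoneme symbols) depends on the tokenizer vocabulary shipped with the checkpoint, which this page does not describe.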

Model Features

Low phoneme error rate

Achieves a phoneme error rate (PER) of 10.53% on the TIMIT test set
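For reference, PER is conventionally the number of phoneme substitutions, deletions, and insertions divided by the number of reference phonemes. The sketch below computes it with a standard Levenshtein distance. Whether the model's authors applied extra steps (for example, folding TIMIT labels into a reduced phoneme set before scoring) is not stated here, so treat this as an illustration of the metric rather than their evaluation script. The function names are hypothetical.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference phoneme missing in hyp)
                curr[j - 1] + 1,     # insertion (extra phoneme in hyp)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs: Sequence[Sequence[str]],
                       hyps: Sequence[Sequence[str]]) -> float:
    """Corpus-level PER: total edit errors divided by total reference phonemes."""
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_ref = sum(len(r) for r in refs)
    if total_ref == 0:
        raise ValueError("reference transcripts contain no phonemes")
    return total_errors / total_ref


# One substitution (dh -> d) and one deletion (t) against five reference phonemes.
reference = [["dh", "ax", "k", "ae", "t"]]
hypothesis = [["d", "ax", "k", "ae"]]
print(phoneme_error_rate(reference, hypothesis))
```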

Detailed phoneme analysis

The model card reports per-phoneme error breakdowns by category, including vowels, stops, and fricatives

Based on wav2vec2 architecture

Fine-tuned from the facebook/wav2vec2-large-lv60 checkpoint

Model Capabilities

English phoneme recognition

Speech feature extraction

Phoneme-level error analysis

Use Cases

Speech research

Phoneme recognition research

Used for linguistic studies and speech recognition system development

Reported result: 10.53% phoneme error rate on the TIMIT test set

Educational technology

Pronunciation assessment

Can be used for pronunciation accuracy evaluation in language learning applications
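One way a language-learning application could use the recognized phonemes is to align them against the expected pronunciation and list the differences. The sketch below uses Python's `difflib.SequenceMatcher` for the alignment. This is an illustrative assumption, not a method described by the model's authors; the function name and example phoneme labels are hypothetical, and `SequenceMatcher` does not guarantee a minimum-edit alignment.

```python
from difflib import SequenceMatcher
from typing import List, Sequence, Tuple


def pronunciation_feedback(
    expected: Sequence[str], recognized: Sequence[str]
) -> List[Tuple[str, List[str], List[str]]]:
    """List the differences between expected and recognized phoneme sequences.

    Each item is (operation, expected_phonemes, recognized_phonemes), where
    operation is 'replace', 'delete' (phoneme not produced), or 'insert'
    (extra phoneme produced).
    """
    matcher = SequenceMatcher(a=list(expected), b=list(recognized), autojunk=False)
    issues = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # correctly pronounced span, nothing to report
        issues.append((tag, list(expected[i1:i2]), list(recognized[j1:j2])))
    return issues


# A learner says "sink" where "think" was expected.
for issue in pronunciation_feedback(["th", "ih", "ng", "k"], ["s", "ih", "ng", "k"]):
    print(issue)
```

Because the model itself has a nonzero error rate, differences reported this way mix learner errors with recognition errors, so a real application would likely need thresholds or repeated attempts before flagging a mistake.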

Training Loss	Epoch	Step	Validation Loss	PER
7.9352	1.04	300	3.7710	0.9617
2.7874	2.08	600	0.9080	0.1929
0.8205	3.11	900	0.4670	0.1492
0.5504	4.15	1200	0.4025	0.1408
0.4632	5.19	1500	0.3696	0.1374
0.4148	6.23	1800	0.3519	0.1343
0.3873	7.27	2100	0.3419	0.1329
0.3695	8.3	2400	0.3368	0.1317
0.3531	9.34	2700	0.3406	0.1320
0.3507	10.38	3000	0.3354	0.1315

Property	Details
Model Type	wav2vec2-large-lv60_phoneme-timit_english_timit-4k
Training Data	TIMIT train dataset (4620 samples)
Test Data	TIMIT test dataset (1680 samples)
License	Apache-2.0

