Unispeech 1350 En 90 It Ft 1h

Developed by Microsoft
UniSpeech is a unified speech representation learning model that combines supervised phoneme CTC learning with self-supervised learning. This checkpoint is fine-tuned for Italian.
Downloads 19
Release Time: 3/2/2022

Model Overview

This model is pre-trained on speech audio sampled at 16 kHz with phoneme labels, then fine-tuned on 1 hour of Italian phoneme data. It is suited to phoneme classification tasks.
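Because the model was trained on 16 kHz audio, input recordings are normally expected to use the same rate. The following is a minimal sketch, using only the Python standard library, of checking a WAV file's sample rate before passing it to the model. The function names and the in-memory test file are hypothetical helpers for illustration and are not part of the model's own tooling.

```python
import io
import wave

EXPECTED_RATE_HZ = 16_000  # rate stated in the model overview


def wav_sample_rate(source):
    """Return the sample rate (Hz) of a PCM WAV file path or file-like object."""
    with wave.open(source, "rb") as wav:
        return wav.getframerate()


def needs_resampling(source, expected_hz=EXPECTED_RATE_HZ):
    """True when the recording's rate differs from the rate the model expects."""
    return wav_sample_rate(source) != expected_hz


def make_silent_wav(rate_hz, seconds=0.1):
    """Build a tiny mono 16-bit WAV in memory (demonstration only)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00\x00" * int(rate_hz * seconds))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for rate in (16_000, 48_000):
        flag = needs_resampling(make_silent_wav(rate))
        print(f"{rate} Hz recording needs resampling: {flag}")
```

A recording that fails this check would need to be resampled to 16 kHz with an audio library of your choice before inference.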

Model Features

Multi-task learning
Simultaneously performs supervised phoneme CTC learning and phoneme-aware contrastive self-supervised learning
Cross-lingual generalization
The learned representations better capture information related to phoneme structure, which improves cross-lingual and cross-domain generalization
Efficient fine-tuning
Requires only 1 hour of Italian phoneme data for fine-tuning
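A model trained with CTC emits one label distribution per audio frame, and a phoneme sequence is typically recovered by taking the best label per frame, merging consecutive repeats, and dropping the blank symbol. The sketch below illustrates that greedy decoding step in plain Python. The label set, the blank index of 0, and the frame scores are made-up assumptions for illustration; the actual model ships its own vocabulary and decoding utilities.

```python
def ctc_greedy_decode(frame_scores, labels, blank_id=0):
    """Greedy CTC decoding: argmax per frame, merge repeats, drop blanks.

    frame_scores: list of per-frame score lists, one score per label.
    labels: label strings indexed by label id (hypothetical vocabulary).
    blank_id: index of the CTC blank symbol (assumed to be 0 here).
    """
    # Pick the highest-scoring label id for every frame.
    best_ids = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]

    decoded = []
    previous = None
    for idx in best_ids:
        # Emit a label only when it differs from the previous frame's label
        # and is not the blank symbol.
        if idx != previous and idx != blank_id:
            decoded.append(labels[idx])
        previous = idx
    return decoded


if __name__ == "__main__":
    labels = ["<blank>", "k", "a", "z"]  # hypothetical phoneme vocabulary
    frames = [
        [0.1, 0.8, 0.05, 0.05],  # k
        [0.1, 0.7, 0.1, 0.1],    # k (repeat, merged)
        [0.9, 0.0, 0.05, 0.05],  # blank
        [0.1, 0.1, 0.7, 0.1],    # a
        [0.1, 0.1, 0.1, 0.7],    # z
        [0.1, 0.1, 0.7, 0.1],    # a
    ]
    print(ctc_greedy_decode(frames, labels))
```

The blank symbol is what lets CTC represent a genuinely doubled phoneme: two identical labels separated by a blank frame are kept as two outputs, while two adjacent identical labels are merged into one.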

Model Capabilities

Speech recognition
Phoneme classification
Cross-lingual speech representation learning

Use Cases

Speech recognition
Italian phoneme recognition
Convert Italian speech into phoneme sequences
Phoneme error rate of 6.69%
Speech technology research
Cross-lingual speech representation research
Study the transferability of speech representations across languages
Compared with self-supervised pre-training and supervised transfer learning, it achieves relative phoneme error rate reductions of up to 13.4% and 17.8%, respectively
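The figures above are phoneme error rates (PER) and relative reductions in PER. As a rough sketch of how such numbers are commonly computed, the code below derives PER from the edit distance between a reference and a predicted phoneme sequence, and computes a relative reduction between two error rates. The function names, example sequences, and baseline values are hypothetical and are not taken from the model's evaluation.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two sequences of phoneme labels."""
    previous_row = list(range(len(hypothesis) + 1))
    for i, ref_item in enumerate(reference, start=1):
        current_row = [i]
        for j, hyp_item in enumerate(hypothesis, start=1):
            substitution = previous_row[j - 1] + (ref_item != hyp_item)
            deletion = previous_row[j] + 1
            insertion = current_row[j - 1] + 1
            current_row.append(min(substitution, deletion, insertion))
        previous_row = current_row
    return previous_row[-1]


def phoneme_error_rate(reference, hypothesis):
    """Edit distance divided by the number of reference phonemes."""
    if not reference:
        raise ValueError("reference must contain at least one phoneme")
    return edit_distance(reference, hypothesis) / len(reference)


def relative_reduction(baseline_rate, new_rate):
    """Fraction by which new_rate improves on baseline_rate."""
    return (baseline_rate - new_rate) / baseline_rate


if __name__ == "__main__":
    reference = ["k", "a", "z", "a"]   # made-up reference phonemes
    hypothesis = ["k", "a", "s", "a"]  # made-up prediction, one substitution
    print(f"PER: {phoneme_error_rate(reference, hypothesis):.2%}")

    # Illustrative values only: a baseline PER of 10.00% dropping to 8.66%.
    print(f"Relative reduction: {relative_reduction(10.0, 8.66):.1%}")
```

Note that a relative reduction is measured against the baseline error rate, so in the illustrative case above a 13.4% relative reduction corresponds to only 1.34 percentage points of absolute improvement.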