PSST-Base-Rep Open-Source Speech Recognition Model - Free Deployment for Precise Speech Recognition

Psst Base Rep

Developed by birgermoell

A baseline speech recognition model based on the Wav2vec2-small architecture and trained on the PSST dataset.

Speech Recognition

Transformers

#Low FER Speech Recognition #Phoneme-Level Error Detection #Academic Scenario Applicable

Downloads 30

Release Time: 4/1/2022

Model Overview

This model reproduces the Wav2vec2-small baseline on the PSST dataset. It is primarily used for speech recognition and supports phoneme-level and character-level recognition.
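A minimal sketch of how a Wav2vec2 checkpoint like this one might be loaded and run with the Transformers library. The Hub id `birgermoell/psst-base-rep` is an assumption inferred from the developer and model name on this page, and the sketch also assumes that the checkpoint ships a processor (feature extractor plus tokenizer) configuration and was fine-tuned with a CTC head. The helper name `transcribe` is hypothetical.

```python
# Assumed Hugging Face Hub id, inferred from the developer and model name.
MODEL_ID = "birgermoell/psst-base-rep"
# Assumed input rate: wav2vec 2.0 base checkpoints are trained on 16 kHz audio.
SAMPLING_RATE = 16_000


def transcribe(waveform, model_id=MODEL_ID, sampling_rate=SAMPLING_RATE):
    """Return the decoded transcript for one 1-D float waveform.

    `waveform` is a mono signal (list or NumPy array of floats) that has
    already been resampled to `sampling_rate`.
    """
    # Imported lazily so this module can be imported without the heavy deps.
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    model.eval()

    # Normalize the raw audio and wrap it in a batch of size 1.
    inputs = processor(waveform, sampling_rate=sampling_rate, return_tensors="pt")

    with torch.no_grad():
        logits = model(inputs.input_values).logits

    # Greedy decoding: most likely token per frame, then let the tokenizer
    # collapse repeats and remove blank tokens.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

The function expects audio that is already loaded and resampled. Any audio library that returns a mono float array at 16 kHz should work as the input source.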

Model Features

Efficient Speech Recognition

Built on the Wav2vec2-small architecture, the model provides efficient speech recognition.

Phoneme and Character-Level Recognition

Supports evaluation with Phoneme Error Rate (PER) and Feature Error Rate (FER).
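As an illustration of the phoneme-level metric, PER is conventionally computed as the edit distance between the reference and predicted phoneme sequences, divided by the number of reference phonemes. The sketch below follows that convention. It is not the evaluation code used to produce the figures on this page, and the function names and the ARPAbet-style sample symbols are illustrative assumptions.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(references, hypotheses):
    """Corpus-level PER: total edits divided by total reference phonemes.

    Each item is a string of space-separated phoneme symbols.
    """
    total_edits = 0
    total_ref = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tokens = ref.split()
        hyp_tokens = hyp.split()
        total_edits += edit_distance(ref_tokens, hyp_tokens)
        total_ref += len(ref_tokens)
    if total_ref == 0:
        raise ValueError("references contain no phonemes")
    return total_edits / total_ref


# Illustrative sample data, not taken from the PSST dataset.
refs = ["HH AH L OW", "W ER L D"]
hyps = ["HH AH L OW", "W ER D"]
print(phoneme_error_rate(refs, hyps))  # 1 edit over 8 reference phonemes: 0.125
```

Summing edits and reference lengths over the whole corpus, rather than averaging per-utterance rates, keeps short utterances from dominating the score.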

Model Capabilities

Speech Recognition

Phoneme Recognition

Character-Level Recognition

Use Cases

Speech Transcription

Speech-to-Text

Convert speech content into text, suitable for meeting minutes, voice notes, and other scenarios.

Feature Error Rate (FER): 10.4%

Speech Analysis

Phoneme Analysis

Analyze the phoneme composition in speech, suitable for linguistic research or speech training.

Phoneme Error Rate (PER): 23.1%

Metric	Value
FER	10.4%
PER	23.1%
