Stt Be Fastconformer Hybrid Large Pc

Developed by nvidia
This is a large Belarusian automatic speech recognition (ASR) model based on the FastConformer architecture. It is a hybrid model trained jointly with Transducer and CTC decoder losses on 1,500 hours of Belarusian speech data.
Downloads 33
Release Time: 5/19/2023

Model Overview

This model transcribes Belarusian speech into text that includes uppercase and lowercase Belarusian letters, spaces, and basic punctuation marks. It expects 16 kHz mono audio as input.
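The input requirement above can be checked before transcription. The following is a minimal sketch, not the publisher's reference code: `is_16khz_mono` and `transcribe` are hypothetical helper names, and the NeMo class `EncDecHybridRNNTCTCBPEModel` and the model name `nvidia/stt_be_fastconformer_hybrid_large_pc` are assumptions based on NVIDIA NeMo conventions rather than anything stated on this page. NeMo is imported lazily, so the format check works without it.

```python
import wave


def is_16khz_mono(path: str) -> bool:
    """Return True if the WAV file at `path` is 16 kHz and single-channel."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate() == 16000 and wav.getnchannels() == 1


def transcribe(paths: list[str]):
    """Transcribe WAV files after verifying the expected input format."""
    # Reject files that do not match the 16 kHz mono requirement.
    bad = [p for p in paths if not is_16khz_mono(p)]
    if bad:
        raise ValueError(f"Resample to 16 kHz mono first: {bad}")

    # Assumption: NeMo is installed; the class and model name below follow
    # NVIDIA NeMo conventions and are not confirmed by this page.
    import nemo.collections.asr as nemo_asr

    model = nemo_asr.models.EncDecHybridRNNTCTCBPEModel.from_pretrained(
        model_name="nvidia/stt_be_fastconformer_hybrid_large_pc"
    )
    # The return type of transcribe() differs between NeMo versions.
    return model.transcribe(paths)
```

Calling `transcribe(["sample.wav"])` with a hypothetical 16 kHz mono file would download the checkpoint on first use. Audio in another format needs to be resampled and downmixed first.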

Model Features

Hybrid training architecture
Trained jointly with Transducer and CTC decoder losses, combining the advantages of both methods
Efficient processing
Uses the FastConformer architecture, which applies 8x depthwise-separable convolutional downsampling to speed up processing
High accuracy
Achieves a WER of 2.72% (excluding punctuation) on the Common Voice 12.0 Belarusian test set
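To make the WER figure above concrete, here is a minimal sketch of how a word error rate that ignores punctuation can be computed. It is not the evaluation script behind the reported 2.72%. The function names, the exact punctuation set, and the choice to keep letter case and the apostrophe are all assumptions made for illustration.

```python
import string

# Assumption: strip ASCII punctuation plus guillemets, dashes, and ellipsis.
# The apostrophe is kept because it occurs inside Belarusian words.
_STRIP = string.punctuation.replace("'", "") + "\u00ab\u00bb\u2014\u2013\u2026"
_PUNCT_TABLE = str.maketrans("", "", _STRIP)


def normalize(text: str) -> list[str]:
    """Remove punctuation and split on whitespace; case is left unchanged."""
    return text.translate(_PUNCT_TABLE).split()


def wer(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = normalize(reference), normalize(hypothesis)
    if not ref:
        raise ValueError("reference has no words")
    prev = list(range(len(hyp) + 1))  # distances for an empty reference
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,                         # deletion
                curr[j - 1] + 1,                     # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)
```

With this sketch, `wer("Добры дзень, сябры!", "Добры дзень сябры")` is 0.0 because only punctuation differs, while `wer("a b c d", "a x c")` is 0.5 (one substitution and one deletion over four reference words).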

Model Capabilities

Belarusian speech recognition
Audio transcription
Punctuation prediction

Use Cases

Speech transcription
Speech-to-text
Convert Belarusian speech content into text
Word accuracy of about 97.28% on the Common Voice 12.0 Belarusian test set (100% minus the 2.72% WER, excluding punctuation)
Voice assistants
Voice command recognition
Used for command recognition in Belarusian voice assistant systems