Wav2vec2 Xlsr 300m Finnish Lm

Developed by Finnish-NLP
A Finnish automatic speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on 275.6 hours of annotated Finnish speech. It can be used with a KenLM language model during decoding.
Downloads 28.39k
Release Time: 3/28/2022

Model Overview

An automatic speech recognition model optimized for Finnish, intended for converting Finnish speech to text.

Model Features

Multi-source training data
Trained on 275.6 hours of Finnish data drawn from Common Voice, parliamentary recordings, broadcast corpora, and other sources, covering a range of speech scenarios.
Language model enhancement
Includes a Finnish KenLM 5-gram language model trained on the audio transcripts and Wikipedia text, which improves recognition accuracy.
Efficient training
Fine-tuned on a V100 GPU using an 8-bit Adam optimizer and mixed-precision training.
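As a rough illustration of how a model like this is typically used, the sketch below loads it through the Hugging Face transformers pipeline and transcribes one file. The Hub ID in MODEL_ID is an assumption inferred from the title and developer name on this page, and the helper names load_asr and transcribe are hypothetical. Decoding with the bundled KenLM model generally also requires the pyctcdecode and kenlm packages to be installed; without them the pipeline is expected to fall back to plain CTC decoding.

```python
# Assumed Hub ID, inferred from the page title and developer name; verify before use.
MODEL_ID = "Finnish-NLP/wav2vec2-xlsr-300m-finnish-lm"


def load_asr(model_id: str = MODEL_ID):
    """Build a speech recognition pipeline (hypothetical helper)."""
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import pipeline

    return pipeline("automatic-speech-recognition", model=model_id)


def transcribe(asr, audio_path: str) -> str:
    """Run the pipeline on one audio file and return only the text."""
    result = asr(audio_path)
    return result["text"]


if __name__ == "__main__":
    asr = load_asr()
    # "sample_fi.wav" is a placeholder path for a Finnish recording.
    print(transcribe(asr, "sample_fi.wav"))
```

When given a file path, the pipeline decodes and resamples the audio itself, which requires ffmpeg to be available on the system.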

Model Capabilities

Finnish speech recognition
Long audio chunk processing
Domain adaptation (requires fine-tuning)
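Long audio chunk processing usually means splitting a recording into overlapping windows, transcribing each one, and merging the results so that words at the window edges are not cut off. The sketch below shows that windowing idea in plain Python and then the equivalent call through the transformers pipeline, whose chunk_length_s and stride_length_s arguments handle the same step internally. The helper names and the default window sizes are illustrative assumptions, not values taken from this model's documentation.

```python
def chunk_spans(total_s: float, chunk_s: float = 20.0, stride_s: float = 2.0):
    """Return (start, end) windows in seconds that cover the whole recording.

    Each window overlaps its neighbours by 2 * stride_s, leaving stride_s of
    context on either side of the part that is kept (illustrative scheme).
    """
    if chunk_s <= 2 * stride_s:
        raise ValueError("chunk_s must be larger than twice stride_s")
    step = chunk_s - 2 * stride_s
    spans = []
    start = 0.0
    while True:
        end = min(start + chunk_s, total_s)
        spans.append((start, end))
        if end >= total_s:
            break
        start += step
    return spans


def transcribe_long(asr, audio_path: str, chunk_s: float = 20.0, stride_s: float = 2.0) -> str:
    """Transcribe a long file by letting the pipeline do the chunking."""
    result = asr(audio_path, chunk_length_s=chunk_s, stride_length_s=stride_s)
    return result["text"]


if __name__ == "__main__":
    # Windows for a 50-second recording with the default sizes.
    for start, end in chunk_spans(50.0):
        print(f"{start:5.1f}s to {end:5.1f}s")
```

Here asr is expected to be an automatic-speech-recognition pipeline object; larger strides give more context at the cost of more repeated computation.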

Use Cases

Speech transcription
Parliament recording transcription
Suited to automatic transcription of Finnish parliamentary recordings
Reported WER of 8.16% on a test set dominated by parliament data
Broadcast content subtitle generation
Automatically generates subtitles for Finnish broadcast programs
Reported CER of 1.97% on the broadcast corpus test set
EdTech
Language learning assistance
Used for pronunciation assessment and text feedback for Finnish language learners
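The WER and CER figures quoted in the use cases above are edit-distance rates: the number of word-level (WER) or character-level (CER) substitutions, insertions, and deletions needed to turn the model output into the reference, divided by the reference length. The following is a minimal sketch of that standard definition. It is not the evaluation script used for this model, and text normalization (casing, punctuation), which affects the reported numbers, is left out.

```python
def edit_distance(ref, hyp) -> int:
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # deletion
                cur[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = cur
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    ref = "hyvää huomenta kaikille"
    hyp = "hyvää huomenta kaikki"
    print(f"WER: {wer(ref, hyp):.3f}")
    print(f"CER: {cer(ref, hyp):.3f}")
```

Because Finnish words are long and heavily inflected, a single wrong suffix counts as a full word error, which is one reason CER is often reported alongside WER for Finnish.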