wav2vec2-large-xls-r-300m-slovenian: Open-Source Model for Slovenian Automatic Speech Recognition

Wav2vec2 Large Xls R 300m Slovenian

Developed by infinitejoy

An automatic speech recognition model fine-tuned on Slovenian speech datasets based on facebook/wav2vec2-xls-r-300m

Open Source License: Apache-2.0
Tags: #Slovenian speech recognition, #Low word error rate (18.97% WER), #Multi-scenario speech processing

Downloads 13

Release Time: 3/2/2022

Model Overview

This is an optimized automatic speech recognition (ASR) model for Slovenian, based on the XLS-R-300M architecture and fine-tuned on the Common Voice 7.0 dataset.

Model Features

Multilingual Pretraining

Based on the XLS-R-300M multilingual model with strong cross-language transfer capabilities

Slovenian Optimization

Specifically fine-tuned on the Common Voice Slovenian dataset for optimized performance in this language

Efficient Speech Recognition

Achieves a word error rate (WER) of 18.97% on the Common Voice test set

Model Capabilities

Slovenian speech recognition

Speech-to-text

Conversational speech processing

Use Cases

Speech Transcription

Voice Memo Transcription

Convert Slovenian voice memos into text

Performs well under clear speech conditions

Voice Assistants

Slovenian Voice Command Recognition

Used for front-end speech recognition in localized voice assistants

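Reaches 18.97% WER on the Common Voice 7.0 Slovenian test set; accuracy is considerably lower on the Robust Speech Event data (about 55% WER)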

🚀 wav2vec2-large-xls-r-300m-slovenian

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m, trained on the Slovenian (sl) subset of the mozilla-foundation/common_voice_7_0 dataset for automatic speech recognition in Slovenian.

🚀 Quick Start

On the evaluation set, the model achieves the following results:

Loss: 0.2093
WER: 0.1907

✨ Features

Language Support: Designed for automatic speech recognition in Slovenian.
Fine-Tuned: Based on the pre-trained facebook/wav2vec2-xls-r-300m model, fine-tuned on the Slovenian (sl) subset of the mozilla-foundation/common_voice_7_0 dataset.

💻 Usage Examples

The original model card does not include usage code.
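As an illustration only, and not taken from the model card, here is a minimal sketch of how a fine-tuned wav2vec2 checkpoint like this one can be run through the Transformers `pipeline` API. The Hub ID `infinitejoy/wav2vec2-large-xls-r-300m-slovenian` is an assumption pieced together from the developer and model names on this page, and the helper name `transcribe` is hypothetical. Passing a file path also assumes ffmpeg is available for decoding.

```python
# Assumed Hub ID, combining the developer name and model name from this page.
MODEL_ID = "infinitejoy/wav2vec2-large-xls-r-300m-slovenian"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file and return the recognized text.

    Hypothetical helper for illustration; not part of the model card.
    """
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    # The checkpoint is downloaded on first use. A file path input is decoded
    # and resampled to the model's expected sampling rate through ffmpeg.
    asr = pipeline("automatic-speech-recognition", model=model_id)
    return asr(audio_path)["text"]
```

Calling `transcribe("memo.wav")`, where `memo.wav` stands in for any Slovenian recording, would return the transcript as a plain string.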

📚 Documentation

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 7e-05
train_batch_size: 32
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 1000
num_epochs: 100.0
mixed_precision_training: Native AMP
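To make the schedule concrete, the sketch below restates the listed hyperparameters and computes the learning rate a linear scheduler with warmup would produce at a given step. The mapping onto `transformers.TrainingArguments` keyword names is an assumption (in particular, that `train_batch_size` is a per-device value and that "Native AMP" corresponds to `fp16=True`), `output_dir` is a hypothetical path, and the total of 8000 steps is taken from the last row of the training results in this card.

```python
# Values copied from the hyperparameter list in this card.
PEAK_LR = 7e-05
WARMUP_STEPS = 1000
TOTAL_STEPS = 8000  # final step reported in the training results

# Assumed mapping onto transformers.TrainingArguments keyword arguments.
TRAINING_KWARGS = {
    "output_dir": "./wav2vec2-large-xls-r-300m-slovenian",  # hypothetical path
    "learning_rate": PEAK_LR,
    "per_device_train_batch_size": 32,  # assumes the listed size is per device
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "warmup_steps": WARMUP_STEPS,
    "num_train_epochs": 100.0,
    "fp16": True,  # assumes "Native AMP" means fp16 mixed precision
}


def linear_lr(step: int,
              peak_lr: float = PEAK_LR,
              warmup_steps: int = WARMUP_STEPS,
              total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to the peak rate.
        return peak_lr * step / warmup_steps
    # Decay phase: fall linearly from the peak rate to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)
```

Under these assumptions the rate peaks at 7e-05 at step 1000 and has fallen to half of that by step 4500.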

Training results

Training Loss	Epoch	Step	Validation Loss	Wer
1.785	12.5	1000	0.7465	0.6812
0.8989	25.0	2000	0.2495	0.2732
0.7118	37.5	3000	0.2126	0.2284
0.6367	50.0	4000	0.2049	0.2049
0.5763	62.5	5000	0.2116	0.2055
0.5196	75.0	6000	0.2111	0.1910
0.4949	87.5	7000	0.2131	0.1931
0.4797	100.0	8000	0.2093	0.1907
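The table can be read programmatically to see where each metric bottoms out. The sketch below copies the rows as tuples and derives the steps per epoch and the best checkpoints by validation loss and by WER; the helper names are hypothetical and nothing beyond the table values is assumed.

```python
# Rows copied from the training results table:
# (training loss, epoch, step, validation loss, WER)
ROWS = [
    (1.785, 12.5, 1000, 0.7465, 0.6812),
    (0.8989, 25.0, 2000, 0.2495, 0.2732),
    (0.7118, 37.5, 3000, 0.2126, 0.2284),
    (0.6367, 50.0, 4000, 0.2049, 0.2049),
    (0.5763, 62.5, 5000, 0.2116, 0.2055),
    (0.5196, 75.0, 6000, 0.2111, 0.1910),
    (0.4949, 87.5, 7000, 0.2131, 0.1931),
    (0.4797, 100.0, 8000, 0.2093, 0.1907),
]


def steps_per_epoch(rows):
    """Optimizer steps per epoch, derived from the last logged row."""
    _, epoch, step, _, _ = rows[-1]
    return step / epoch


def best_by_validation_loss(rows):
    """Row with the lowest validation loss."""
    return min(rows, key=lambda row: row[3])


def best_by_wer(rows):
    """Row with the lowest word error rate."""
    return min(rows, key=lambda row: row[4])
```

With these rows, training runs at 80 steps per epoch. Validation loss is lowest at step 4000 (0.2049), while WER keeps improving and is lowest at the final step 8000 (0.1907).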

Framework versions

Transformers 4.16.0.dev0
Pytorch 1.10.1+cu102
Datasets 1.18.3
Tokenizers 0.11.0

📄 License

This model is licensed under the Apache-2.0 license.

📊 Model Index

Property	Details
Model Type	wav2vec2-large-xls-r-300m-slovenian
Training Data	mozilla-foundation/common_voice_7_0 (sl)

Results

Task: Automatic Speech Recognition
- Dataset: Common Voice 7 (mozilla-foundation/common_voice_7_0, args: sl)
  - Metrics:
    - Test WER: 18.97
    - Test CER: 4.534
- Dataset: Robust Speech Event - Dev Data (speech-recognition-community-v2/dev_data, args: sl)
  - Metrics:
    - Test WER: 55.048
    - Test CER: 22.739
- Dataset: Robust Speech Event - Test Data (speech-recognition-community-v2/eval_data, args: sl)
  - Metrics:
    - Test WER: 54.81
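For readers unfamiliar with the metrics, WER and CER are edit distances between the reference and the predicted transcript, divided by the reference length in words or characters. The sketch below is a plain-Python illustration of that definition, not the evaluation script used for the numbers above, which may also normalize text before scoring. It returns fractions, whereas the results in this section are given as percentages.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two sequences (words or characters)."""
    # prev[j] holds the distance between the reference prefix processed so far
    # and the first j items of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_item in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_item in enumerate(hypothesis, start=1):
            substitution = prev[j - 1] + (0 if ref_item == hyp_item else 1)
            deletion = prev[j] + 1      # reference item missing from hypothesis
            insertion = curr[j - 1] + 1  # extra item in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate as a fraction of the reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate as a fraction of the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)
```

For example, scoring the hypothesis "dober den vsem" against the reference "dober dan vsem skupaj" gives one substitution and one deletion over four reference words, a WER of 0.5.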
