xls-r-300-sv-cv7 Open-Source Automatic Speech Recognition Model - Accurately Identify Swedish Speech Content

xls-r-300-sv-cv7

Developed by patrickvonplaten

An automatic speech recognition model based on facebook/wav2vec2-xls-r-300m and fine-tuned on the Swedish portion of the Common Voice 7.0 dataset.

Speech Recognition

Transformers

License: Apache-2.0

Tags: Swedish speech recognition, Multi-scenario adaptation, Low word error rate

Downloads 19

Release Time: 3/2/2022

Model Overview

This model performs automatic speech recognition for Swedish. On the Common Voice 7.0 test set it reaches a word error rate (WER) of 15.99%.

Model Features

High-performance Swedish recognition

Achieves a word error rate (WER) of 15.99% on the Common Voice 7.0 test set.

Multi-dataset validation

Evaluated on both the Common Voice 7.0 test set and the Robust Speech Event dev data (speech-recognition-community-v2/dev_data).

Based on XLS-R architecture

Uses facebook/wav2vec2-xls-r-300m as the base model.

Model Capabilities

Swedish speech recognition

Long audio processing (supports chunked inference)

Use Cases

Speech-to-text

Swedish speech transcription

Convert Swedish speech content into text

WER 15.99% on Common Voice test set

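Speech recognition on additional data

Robust Speech Event dev data

Transcribe Swedish speech from the speech-recognition-community-v2/dev_data set (a speech recognition benchmark, not an event-detection task)

WER 24.41% on the Robust Speech Event dev data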

🚀 XLS-R-300M - Swedish - CV7 - v2

This is a fine-tuned model for Swedish automatic speech recognition, based on facebook/wav2vec2-xls-r-300m.

🚀 Quick Start

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the MOZILLA-FOUNDATION/COMMON_VOICE_7_0 - SV-SE dataset. It achieves the following results on the evaluation set:

Loss: 0.2604
Wer: 0.2334
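
A minimal usage sketch, assuming the `transformers` library (with a PyTorch backend) is installed and that an audio decoder such as ffmpeg is available for reading files. The helper name `transcribe` and the file name in the comment are hypothetical; the chunk and stride defaults mirror the values used in the evaluation command in this document.

```python
MODEL_ID = "patrickvonplaten/xls-r-300-sv-cv7"


def transcribe(audio_path, chunk_length_s=5.0, stride_length_s=1.0):
    """Transcribe a Swedish audio file and return the recognized text."""
    # Imported lazily so that defining this helper does not itself require
    # transformers to be importable or trigger a model download.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    # Chunked inference lets the pipeline process long recordings piece by piece.
    result = asr(
        audio_path,
        chunk_length_s=chunk_length_s,
        stride_length_s=stride_length_s,
    )
    return result["text"]


# Example call (hypothetical file name):
# print(transcribe("sample_sv.wav"))
```

The first call downloads the model weights, so it needs network access; building the pipeline once and reusing it is preferable when transcribing many files.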

✨ Features

Automatic Speech Recognition: Specialized for Swedish speech recognition tasks.
Fine-Tuned: Based on a pre-trained model and fine-tuned on specific Swedish datasets.

📚 Documentation

Model Information

Property	Details
Model Type	XLS-R-300M - Swedish - CV7 - v2
Training Data	MOZILLA-FOUNDATION/COMMON_VOICE_7_0 - SV-SE
Results on Evaluation Set	Loss: 0.2604; Wer: 0.2334

Model Index

Name: XLS-R-300M - Swedish - CV7 - v2
Results:
- Task 1:
  - Task Name: Automatic Speech Recognition
  - Dataset: Common Voice 7 (mozilla-foundation/common_voice_7_0, args: sv-SE)
  - Metrics:
    - Test WER: 15.99
    - Test CER: 5.2
- Task 2:
  - Task Name: Automatic Speech Recognition
  - Dataset: Robust Speech Event - Dev Data (speech-recognition-community-v2/dev_data, args: sv)
  - Metrics:
    - Test WER: 24.41
    - Test CER: 11.88
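
For readers unfamiliar with the two metrics, the sketch below shows how WER and CER are commonly defined: the edit distance between reference and hypothesis, divided by the reference length, counted over words or characters. It is an illustration only, assuming the reported figures are percentages; the evaluation script may also normalize text and aggregate errors over the whole test set, which this per-sentence version does not do. All function names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference item
                curr[j - 1] + 1,     # insertion of a hypothesis item
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent for a single non-empty reference."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent for a single non-empty reference."""
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)
```

For example, `wer("jag heter anna", "jag heter hanna")` counts one substituted word out of three, while `cer("anna", "hanna")` counts one inserted character against four reference characters.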

Training Procedure

Training Hyperparameters

The following hyperparameters were used during training:

learning_rate: 7.5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
distributed_type: multi-GPU
num_devices: 8
gradient_accumulation_steps: 1
total_train_batch_size: 32
total_eval_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 2000
num_epochs: 50.0
mixed_precision_training: Native AMP
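
Two of the values above follow from the others, and a short sketch makes the relationship concrete. The first function reproduces the total batch size from the per-device size, device count, and accumulation steps. The second illustrates a linear schedule with warmup: the rate rises linearly to the peak over the warmup steps, then decays linearly to zero. The total step count is not stated in this document, so the `total_steps` value used in any call is an assumption, and both function names are hypothetical.

```python
def effective_batch_size(per_device, num_devices, grad_accum_steps):
    """Number of samples contributing to one optimizer update."""
    return per_device * num_devices * grad_accum_steps


def linear_schedule_lr(step, base_lr, warmup_steps, total_steps):
    """Learning rate at a given step under linear warmup and linear decay."""
    if step < warmup_steps:
        # Warmup phase: ramp from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay phase: ramp from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

With the listed settings, `effective_batch_size(4, 8, 1)` gives 32, matching total_train_batch_size. Halfway through the 2000 warmup steps the rate is half of 7.5e-05.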

Training Results

See Tensorboard

Evaluation Commands

To evaluate on mozilla-foundation/common_voice_7_0 with split test

python eval.py --model_id patrickvonplaten/xls-r-300-sv-cv7 --dataset mozilla-foundation/common_voice_7_0 --config sv-SE --split test

To evaluate on speech-recognition-community-v2/dev_data

python eval.py --model_id patrickvonplaten/xls-r-300-sv-cv7 --dataset speech-recognition-community-v2/dev_data --config sv --split validation --chunk_length_s 5.0 --stride_length_s 1.0
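
The `--chunk_length_s` and `--stride_length_s` flags in the second command control how long audio is split before transcription. The sketch below illustrates the idea under the assumption that the stride applies to both sides of each chunk, so consecutive chunks advance by the chunk length minus twice the stride. It is a simplified illustration with a hypothetical function name, not the library's actual implementation.

```python
def chunk_windows(total_s, chunk_length_s=5.0, stride_length_s=1.0):
    """Return (start, end) times in seconds of overlapping chunks covering the audio."""
    # Each chunk shares stride_length_s of context with its left and right
    # neighbours, so the window advances by less than a full chunk.
    step = chunk_length_s - 2 * stride_length_s
    if step <= 0:
        raise ValueError("chunk_length_s must exceed twice stride_length_s")
    windows = []
    start = 0.0
    while True:
        end = min(start + chunk_length_s, total_s)
        windows.append((start, end))
        if end >= total_s:
            break
        start += step
    return windows
```

With the values from the command, a 10-second recording is covered by three overlapping windows starting at 0, 3, and 6 seconds. The overlap gives the model context at chunk borders, where predictions tend to be least reliable.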

Framework Versions

Transformers 4.17.0.dev0
Pytorch 1.9.0+cu111
Datasets 1.18.4.dev0
Tokenizers 0.10.3

📄 License

This model is licensed under the Apache-2.0 license.
