Wav2vec2-large-xls-r-300m-breton-cv8 Open-source Model - Empowering Breton Speech Recognition

Wav2vec2 Large Xls R 300m Breton Cv8

Developed by infinitejoy

This is an automatic speech recognition model for Breton, fine-tuned from facebook/wav2vec2-xls-r-300m on a Breton-language dataset.

Speech Recognition

Transformers

Open Source License: Apache-2.0

Tags: Breton speech recognition, low-resource language processing, multilingual speech models

Downloads 17

Release Time: 3/2/2022

Model Overview

This model is specifically designed for automatic speech recognition tasks in Breton, fine-tuned on the Common Voice 8 dataset

Model Features

Breton language support

Speech recognition model specifically optimized for Breton

Based on XLS-R architecture

Uses the powerful wav2vec2-xls-r-300m as the base model

Trained on Common Voice dataset

Fine-tuned on Mozilla Common Voice 8's Breton dataset

Model Capabilities

Breton speech recognition

Speech-to-text

Use Cases

Speech transcription

Breton speech transcription

Convert Breton speech to text

Test WER 54.855, Test CER 17.865

Voice assistants

Breton voice assistant

Supports voice interaction applications in Breton

🚀 XLS-R-300M - Breton

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the MOZILLA-FOUNDATION/COMMON_VOICE_8_0 - BR dataset. It is designed for automatic speech recognition and reaches a test WER of 54.855 and a test CER of 17.865 on the Breton portion of Common Voice 8.

📚 Documentation

Model Information

Property	Details
Model Type	Automatic Speech Recognition
Training Data	mozilla-foundation/common_voice_8_0
Model Name	XLS-R-300M - Breton

Evaluation Results

This model achieves the following results on the evaluation set:

Loss: NA
Wer: NA

In the model index, for the Automatic Speech Recognition task on the Common Voice 8 dataset (type: mozilla-foundation/common_voice_8_0, args: br), the metrics are as follows:

Test WER: 54.855
Test CER: 17.865
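
For readers unfamiliar with these metrics, here is a minimal sketch of how word error rate (WER) and character error rate (CER) are commonly computed: the edit distance between reference and hypothesis, divided by the reference length. It assumes the reported values are percentages. The function names and the example strings are illustrative only, and the evaluation script may apply text normalization that is not reproduced here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent, relative to the reference word count."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent, relative to the reference length."""
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)


# Illustrative strings, not taken from the dataset.
reference = "demat d'an holl"
hypothesis = "demat d'an oll"
print(f"WER: {wer(reference, hypothesis):.3f}")
print(f"CER: {cer(reference, hypothesis):.3f}")
```

A single wrong character makes the whole word count as an error, which is one reason WER is typically much higher than CER for the same transcriptions.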

Framework Versions

Transformers 4.16.0.dev0
Pytorch 1.10.0+cu102
Datasets 1.17.1.dev0
Tokenizers 0.10.3

💻 Usage Examples

Evaluation Commands

Evaluate on `mozilla-foundation/common_voice_8_0` with split `test`

python eval.py --model_id infinitejoy/wav2vec2-large-xls-r-300m-breton-cv8 --dataset mozilla-foundation/common_voice_8_0 --config br --split test

Evaluate on `speech-recognition-community-v2/dev_data`

python eval.py --model_id infinitejoy/wav2vec2-large-xls-r-300m-breton-cv8 --dataset speech-recognition-community-v2/dev_data --config br --split validation --chunk_length_s 5.0 --stride_length_s 1.0
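
The `--chunk_length_s 5.0 --stride_length_s 1.0` options suggest that long recordings are transcribed in overlapping windows. The source does not show `eval.py`, so the following is only a sketch, assuming the options behave like the chunking in the Transformers speech recognition pipeline: each window is 5 s long, carries 1 s of context on each inner side, and the window start advances by the chunk length minus both strides. The function name `chunk_windows` is hypothetical.

```python
def chunk_windows(num_samples, sampling_rate=16_000,
                  chunk_length_s=5.0, stride_length_s=1.0):
    """Return (start, end, left_stride, right_stride) tuples in samples."""
    chunk_len = int(round(chunk_length_s * sampling_rate))
    stride = int(round(stride_length_s * sampling_rate))
    step = chunk_len - 2 * stride  # how far each window start advances
    if step <= 0:
        raise ValueError("chunk length must exceed twice the stride length")

    windows = []
    for start in range(0, num_samples, step):
        end = min(start + chunk_len, num_samples)
        is_last = start + chunk_len >= num_samples
        left = 0 if start == 0 else stride   # no left context on the first window
        right = 0 if is_last else stride     # no right context on the last window
        windows.append((start, end, left, right))
        if is_last:
            break
    return windows


# A 12-second clip at 16 kHz.
for window in chunk_windows(12 * 16_000):
    print(window)
```

The strides mark the overlapping context on each side of a window. In this scheme the predictions for those regions are discarded when the windows are stitched back together, so only the central part of each window contributes to the final transcript.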

Inference With LM

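import torch
from datasets import load_dataset
from transformers import AutoModelForCTC, AutoProcessor
import torchaudio.functional as F


model_id = "infinitejoy/wav2vec2-large-xls-r-300m-breton-cv8"

# Stream the Breton test split so the full dataset is not downloaded up front.
# use_auth_token=True uses the locally stored Hugging Face login token.
sample_iter = iter(load_dataset("mozilla-foundation/common_voice_8_0", "br", split="test", streaming=True, use_auth_token=True))

# Take one clip and resample it from 48 kHz to the 16 kHz the model expects.
sample = next(sample_iter)
resampled_audio = F.resample(torch.tensor(sample["audio"]["array"]), 48_000, 16_000).numpy()

model = AutoModelForCTC.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)

input_values = processor(resampled_audio, sampling_rate=16_000, return_tensors="pt").input_values

# Inference only, so gradient tracking is disabled.
with torch.no_grad():
    logits = model(input_values).logits

# With an LM-boosted processor, batch_decode takes the raw logits and returns
# an object whose .text attribute holds the transcriptions.
transcription = processor.batch_decode(logits.numpy()).text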
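
The snippet above leaves the step from logits to text to `batch_decode`. As a rough illustration of the underlying idea, here is a sketch of plain greedy CTC decoding (not the LM-boosted decoding the processor may perform): take the most likely token per frame, merge consecutive repeats, drop the blank token, and turn the word delimiter into a space. The toy vocabulary and frame ids are made up for the example. The real token ids come from the model's vocabulary file.

```python
def ctc_greedy_decode(frame_ids, id_to_token, blank_id=0, word_delimiter="|"):
    """Collapse per-frame token ids into text using the greedy CTC rule."""
    tokens = []
    prev = None
    for idx in frame_ids:
        # Emit a token only when it differs from the previous frame
        # and is not the blank symbol.
        if idx != prev and idx != blank_id:
            tokens.append(id_to_token[idx])
        prev = idx
    return "".join(tokens).replace(word_delimiter, " ").strip()


# Hypothetical vocabulary: id 0 is the blank (padding) token, "|" separates words.
toy_vocab = {0: "<pad>", 1: "|", 2: "a", 3: "b"}

# Per-frame argmax ids, as would be obtained from logits.argmax(dim=-1).
frame_ids = [2, 2, 0, 2, 3, 3, 1, 1, 3, 0]
print(ctc_greedy_decode(frame_ids, toy_vocab))
```

The blank between the two runs of `a` is what allows a doubled letter to survive the merging of repeats.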

Eval results on Common Voice 7 "test" (WER):

📄 License

This model is released under the Apache-2.0 license.
