# High-precision WER

Wav2vec2 Large Xlrs Korean V5

A Korean automatic speech recognition model, fine-tuned from facebook/wav2vec2-xls-r-300m on the zeroth_korean dataset, with a reported word error rate (WER) of 0.2433.

Speech Recognition
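
For readers comparing entries by WER, the following is a minimal sketch of how word error rate is commonly computed: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` and the example sentences are illustrative assumptions, not taken from any model card, and the sketch applies no text normalization, so it may not reproduce the exact figure reported for a given model.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Hypothetical helper for illustration; whitespace tokenization only,
    no casing or punctuation normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Under this definition, a WER of 0.2433 corresponds to roughly 24 word errors per 100 reference words. WER can exceed 1.0 when the hypothesis contains many insertions.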

Wav2vec2 Large Xlsr 53 Icelandic Ep30 967h

An acoustic model for Icelandic automatic speech recognition, fine-tuned on 967 hours of Icelandic data.

Speech Recognition

Transformers Other

language-and-voice-lab

Stt Ru Fastconformer Hybrid Large Pc

This is a FastConformer hybrid model for Russian automatic speech recognition, combining Transducer and CTC decoders with approximately 115 million parameters.

Speech Recognition Other

Stt De Fastconformer Hybrid Large Pc

This is a FastConformer hybrid model for German automatic speech recognition, combining Transducer and CTC decoders with approximately 115 million parameters.

Speech Recognition German

Wav2vec2 Large Xlsr 53 Spanish Ep5 944h

An acoustic model for Spanish automatic speech recognition, fine-tuned for 5 epochs based on facebook/wav2vec2-large-xlsr-53 using approximately 944 hours of Spanish data.

Speech Recognition

Transformers Spanish

carlosdanielhernandezmena

Wav2vec2 Large Vi Vlsp2020

A Vietnamese automatic speech recognition model based on the wav2vec2 architecture, pre-trained on 13,000 hours of unlabeled YouTube audio and fine-tuned on 250 hours of labeled data.

Speech Recognition

Transformers Other

Stt Ru Conformer Ctc Large

This is a large Conformer-CTC model for Russian automatic speech recognition, trained on approximately 1,636 hours of Russian speech data with about 120 million parameters.

Speech Recognition Other

Stt Es Conformer Ctc Large

This is a large Conformer-CTC model for Spanish automatic speech recognition (ASR), trained and released by NVIDIA.

Speech Recognition Spanish

Stt Fr Conformer Transducer Large

This is a large-scale Conformer-Transducer model for French automatic speech recognition, with approximately 120 million parameters, trained on over 1,500 hours of French speech data.

Speech Recognition French

Stt Fr Conformer Ctc Large

This is a large French automatic speech recognition (ASR) model based on the Conformer architecture, trained with a CTC loss on over 1,500 hours of French speech data.

Speech Recognition French
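
Several entries in this list are CTC models. As a rough illustration of what a CTC decoder does at inference time, the sketch below shows greedy CTC decoding: take the best token per audio frame, merge consecutive repeats, then drop the blank token. The function `ctc_greedy_decode`, the toy vocabulary, and the choice of index 0 for the blank are all assumptions made for this example; real models define their own vocabularies and blank index, and may use beam search or a language model instead.

```python
from itertools import groupby
from typing import Sequence


def ctc_greedy_decode(
    frame_ids: Sequence[int],
    vocab: Sequence[str],
    blank_id: int = 0,
) -> str:
    """Collapse per-frame token ids into text (hypothetical helper).

    frame_ids: the highest-scoring token id for each audio frame.
    vocab:     maps token id to its string; blank_id marks the CTC blank.
    """
    # Step 1: merge runs of identical consecutive ids into a single id.
    collapsed = [token_id for token_id, _ in groupby(frame_ids)]
    # Step 2: remove blanks, which only separate genuinely repeated tokens.
    return "".join(vocab[i] for i in collapsed if i != blank_id)


# Toy vocabulary (assumption): index 0 is the blank symbol.
toy_vocab = ["_", "c", "a", "t"]
decoded = ctc_greedy_decode([1, 1, 0, 2, 2, 0, 0, 3], toy_vocab)
```

The blank token is what allows doubled letters to survive the merge step: the frame sequence `[1, 0, 1]` decodes to "cc", while `[1, 1]` decodes to a single "c".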

Ai Light Dance Stepmania Ft Wav2vec2 Large Xlsr 53

An automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the GARY109/AI_LIGHT_DANCE - ONSET-STEPMANIA2 dataset.

Speech Recognition

Wav2vec2 Large Multilang Cv Ru

This model is a fine-tuned version of facebook/wav2vec2-large-xlsr-53 on the common_voice dataset, primarily designed for Russian speech recognition tasks.

Speech Recognition

Assignment1 Joane

A speech-to-text (S2T) model for automatic speech recognition (ASR)

Speech Recognition

Transformers English

Classroom-workshop

Assignment1 Jack

A speech-to-text (S2T) model for automatic speech recognition (ASR), based on a sequence-to-sequence transformer architecture

Speech Recognition

Transformers English

Classroom-workshop

Assignment1 Omar

Wav2Vec2 is a self-supervised learning-based speech recognition model, pre-trained and fine-tuned on 960 hours of LibriSpeech audio data, supporting English speech transcription.

Speech Recognition

Transformers English

Classroom-workshop

Wav2vec2 Large Xls R 300m Singlish Colab

A speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the Singapore English (li_singlish) dataset.

Speech Recognition

Ai Light Dance Singing Ft Wav2vec2 Large Lv60 V2

An automatic speech recognition model fine-tuned from wav2vec2-large-lv60 on the ONSET-SINGING dataset, focusing on singing voice recognition.

Speech Recognition

Dansk Wav2vec21

This model is a Danish speech recognition model fine-tuned by Siyam/SKYLy on the common_voice dataset

Speech Recognition

English Filipino Wav2vec2 L Xls R Test 02

A speech recognition model fine-tuned from wav2vec2-large-xlsr-53-english on Filipino speech datasets, supporting English and Filipino speech-to-text.

Speech Recognition

Wav2vec2 Common Voice Lithuanian

This model is a fine-tuned version of facebook/wav2vec2-large-xlsr-53 on the COMMON_VOICE - LT dataset for Lithuanian speech recognition.

Speech Recognition

Transformers Other

20220413 210552

A speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the common_voice dataset.

Speech Recognition

Aradia Ctc Distilhubert Ft

An automatic speech recognition (ASR) model fine-tuned from distilhubert on Arabic speech datasets.

Speech Recognition

Wav2vec2 Large Xls R 300m Irish Colab Test

An Irish automatic speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the Common Voice Irish dataset.

Speech Recognition

Wav2vec2 Xls R 1b Portuguese CORAA 3

A Portuguese automatic speech recognition model fine-tuned from facebook/wav2vec2-xls-r-1b on the CORAA dataset.

Speech Recognition

Transformers Other

Wav2vec2 Large Xls R 300m Odia Cv8

An automatic speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the Odia (OR) Common Voice dataset.

Speech Recognition

Transformers Other

Wav2vec2 Large Xls R 300m Ur

Urdu speech recognition model based on the wav2vec2-large-xls-r-300m architecture, fine-tuned on the Common Voice dataset

Speech Recognition

S2t Small Librispeech Asr

A speech-to-text (S2T) model for automatic speech recognition (ASR), based on a sequence-to-sequence transformer architecture

Speech Recognition

Transformers English

Wav2vec2 Xls R 1b German

This is a German automatic speech recognition model based on the XLS-R 1B architecture, fine-tuned on multiple German speech datasets including Common Voice 8.0

Speech Recognition

Transformers German

Wav2vec2 Large Xlsr 53 Ir

An Irish Gaelic automatic speech recognition model fine-tuned from wav2vec2-large-xlsr-53 on the Common Voice 7.0 dataset.

Speech Recognition

Wav2vec2 Xls R 1b Italian

This is an Italian automatic speech recognition model based on the XLS-R 1B architecture, fine-tuned on multiple Italian datasets

Speech Recognition

Transformers Other

Wav2vec2 Speechdat

A Swedish automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the COMMON_VOICE - SV-SE dataset.

Speech Recognition

Wav2vec2 Xlsr Basaa

An automatic speech recognition model fine-tuned from facebook/wav2vec2-xls-r-1b on the Common Voice 8 Basaa dataset.

Speech Recognition

Transformers Other

Wav2vec2 Base Turkish Cv7

Turkish automatic speech recognition model based on wav2vec2 architecture, fine-tuned on the Common Voice 7.0 Turkish dataset

Speech Recognition

Transformers Other

Wav2vec2 Xls R 1b Hi Cv8

A Hindi automatic speech recognition model fine-tuned from facebook/wav2vec2-xls-r-1b on the Common Voice 8.0 Hindi dataset.

Speech Recognition

Transformers Other

Wav2vec2 Xls R 1b Russian

A Russian speech recognition model based on the XLS-R 1B architecture, fine-tuned on datasets including Common Voice 8.0.

Speech Recognition

Transformers Other

Wav2vec2 Large Xls R 300m Galician

A Galician automatic speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on Galician speech datasets.

Speech Recognition

Transformers Other

Wav2vec2 Xlsr Czech

A Czech automatic speech recognition model fine-tuned from facebook/wav2vec2-xls-r-1b on the MOZILLA-FOUNDATION/COMMON_VOICE_8_0 - cs dataset.

Speech Recognition

Transformers Other

Wav2vec2 Xls R 1b Portuguese

This is a Portuguese automatic speech recognition model based on the XLS-R 1B architecture, fine-tuned on multiple Portuguese speech datasets.

Speech Recognition

Transformers Other

S2t Large Librispeech Asr

An end-to-end sequence-to-sequence transformer model for automatic speech recognition (ASR), trained on the LibriSpeech dataset

Speech Recognition

Transformers English

Wav2vec2 Xl 960h Dementiabank

A speech recognition model fine-tuned from facebook/wav2vec2-large-960h on the DementiaBank dataset, primarily used for speech-to-text tasks.

Speech Recognition


© 2025 AIbase