# Speech-to-text

Whisper Finetuned Amharic

Amharic speech recognition model fine-tuned from openai/whisper-small, achieving a word error rate of 2.0538% on the evaluation set

Speech Recognition

Wav2vec2 Large Xls R 300m Ru

A Russian automatic speech recognition (ASR) model fine-tuned from facebook/wav2vec2-xls-r-300m on the common_voice_17_0 dataset, with a word error rate (WER) of 0.195.

Speech Recognition
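
The first two entries report word error rate in different forms: 2.0538% for the Amharic model and 0.195 for the Russian one. The listing does not say whether 0.195 is a fraction or a percentage, so the two figures should not be compared directly without checking the model cards. As a minimal sketch of how WER is commonly computed (word-level edit distance divided by the number of reference words), the following uses a hypothetical helper named `word_error_rate`; it is an illustration, not the evaluation script used by either model's author.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return word-level edit distance divided by the reference word count.

    Hypothetical helper for illustration; real evaluations usually
    normalize text (case, punctuation) before scoring.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One of six reference words is dropped in the hypothesis.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
print(f"WER as a fraction: {wer:.4f}")
print(f"WER as a percentage: {wer * 100:.2f}%")
```

The same value can be written either way, which is why a bare number such as 0.195 needs its unit confirmed before it is set beside a figure such as 2.0538%.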

Whisper Small Sinhala

A Sinhala speech recognition model fine-tuned from OpenAI Whisper-small

Speech Recognition

Transformers Other

Lingalingeswaran

Whisper Hindi2Hinglish Swift

A mixed Hindi-Hinglish speech recognition model built on the Whisper architecture and optimized for Indian accents and noisy environments

Speech Recognition

Transformers Supports Multiple Languages

Whisper Large V3 Turbo Arabic

A version of openai/whisper-large-v3-turbo fine-tuned with the transformers library on the common_voice_11_0 dataset and optimized for Arabic speech recognition.

Speech Recognition

Distil Whisper Large V3

This model is a conversion from the GGML format of distil-whisper/distil-large-v3-ggml to Ratchet's custom format, primarily used for speech recognition tasks.

Speech Recognition

Language Detector

A language detection model fine-tuned from openai/whisper-small, achieving 96.47% accuracy on the evaluation set

Speech Recognition

Whisper Large V3 Ft Cv16 Mn

A speech recognition model fine-tuned from OpenAI Whisper Large V3 on the Common Voice 16.0 dataset

Speech Recognition

Whisper Large V2 Spanish

A speech recognition model fine-tuned from OpenAI Whisper-large-v2 on the Common Voice 13.0 Spanish dataset

Speech Recognition

An automatic speech recognition model from Facebook's Massively Multilingual Speech project, supporting 1107 languages, based on Wav2Vec2 architecture with adapter technology for multilingual transcription.

Speech Recognition

Transformers Supports Multiple Languages

Faster Whisper Tiny

This is the CTranslate2 converted version of the OpenAI Whisper-tiny model, used for efficient speech recognition tasks.

Speech Recognition

Supports Multiple Languages

Whisper Large V2 Malayalam

This is a fine-tuned version of the OpenAI Whisper Large V2 model for Malayalam speech recognition tasks, trained using the Common Voice 11.0 dataset

Speech Recognition

Transformers Other

Wav2vec2 Large Xls R 300m Bn Colab

A Bengali speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the common_voice_9_0 dataset.

Speech Recognition

Wav2vec2 Large Multilang Cv Ru

This model is a fine-tuned version of facebook/wav2vec2-large-xlsr-53 on the common_voice dataset, primarily designed for Russian speech recognition tasks.

Speech Recognition

Wav2vec2 Large Xls R 300m Ta Colab

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the common_voice dataset, primarily used for Tamil speech recognition tasks.

Speech Recognition

84rry Xlsr 53 Arabic

An Arabic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Common Voice dataset

Speech Recognition

Wav2vec2 Large Xls R 300m Turkish Colab Common Voice 8 6

This is a Turkish speech recognition model based on the wav2vec2 architecture, fine-tuned on the common_voice dataset

Speech Recognition

Dansk Wav2vec21

This model is a Danish speech recognition model fine-tuned by Siyam/SKYLy on the common_voice dataset

Speech Recognition

Wav2vec2 Vorarlbergerisch

A German dialect speech recognition model fine-tuned from facebook/wav2vec2-base-960h, supporting Vorarlberg regional dialect recognition in Austria

Speech Recognition

Wav2vec2 Base MIR ST500 ASR 109

An automatic speech recognition model fine-tuned from facebook/wav2vec2-base on the MIR_ST500 dataset

Speech Recognition

Wav2vec2 Common Voice Accents Scotland

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the common_voice dataset, specializing in Scottish accent speech recognition.

Speech Recognition

Wav2vec2 Common Voice Accents

A speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the common_voice dataset, supporting recognition of multiple accents

Speech Recognition

Wav2vec2 Xls R 100m Common Voice Tr Ft

An automatic speech recognition (ASR) model fine-tuned from facebook/wav2vec2-xls-r-100m on the COMMON_VOICE - TR Turkish dataset.

Speech Recognition

Transformers Other

patrickvonplaten

Xls R Ab Spanish

An automatic speech recognition model fine-tuned from the XLS-R dummy model on an Abkhazian-language dataset

Speech Recognition

Transformers Other

Wav2vec2 Large Xlsr 129 Turkish Colab

A Turkish speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-129 on the Common Voice dataset

Speech Recognition

patrickvonplaten

Wav2vec2 Large Xlsr Open Brazilian Portuguese V2

This is a Wav2vec2 model optimized for Brazilian Portuguese, trained on multiple open datasets for automatic speech recognition tasks.

Speech Recognition

Transformers Other

Wav2vec2 Large Xls R 300m Da Colab

A Danish speech recognition model fine-tuned from Alvenir/wav2vec2-base-da, suitable for Danish speech-to-text tasks

Speech Recognition

Wav2vec2 Large Xls R 300m Guarani Small

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the Common Voice dataset, supporting Guarani speech recognition.

Speech Recognition

Wav2vec2 Xls R 300m W2V2 XLSR 300M YAKUT SMALL

A speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on a Yakut (Sakha) language dataset

Speech Recognition

Transformers Other

Tamil Wav2Vec Xls R 300m Tamil Colab

A Tamil speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the Common Voice dataset.

Speech Recognition

Transformers Other

bharat-raghunathan

Wav2vec2 Xlsr Breton

An automatic speech recognition model for Breton, fine-tuned from facebook/wav2vec2-xls-r-1b.

Speech Recognition

Transformers Other

Wav2vec2 Large Xls R 300m Turkish Colab

A speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the Common Voice Turkish dataset

Speech Recognition

Wav2vec2 Large Xls R 300m Turkish Colab

A speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the Common Voice Turkish dataset

Speech Recognition

Wav2vec2 Large Xls R 300m Turkish Colab

A speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the Common Voice Turkish dataset

Speech Recognition

patrickvonplaten

A Hausa automatic speech recognition model based on the wav2vec2-xls-r-300m architecture, fine-tuned on the Common Voice 8.0 Hausa dataset

Speech Recognition

Transformers Other

© 2025 AIbase