# Speech-to-Text

Whisper Large V3 Turbo

An ONNX-optimized Whisper large speech recognition model designed for web deployment

Speech Recognition

W2V2 BERT Withlm Malayalam

A Malayalam automatic speech recognition model fine-tuned from facebook/w2v-bert-2.0 on multiple Malayalam datasets, decoded with a trigram language model trained using the KenLM library.

Speech Recognition

Transformers Other

Whisper is an automatic speech recognition (ASR) system trained by OpenAI, supporting multilingual speech transcription.

Speech Recognition

WHISPER SMALL SWAHILI ASR CV 14

A speech recognition model fine-tuned from OpenAI's Whisper large on the Common Voice 14.0 Swahili (sw) dataset, achieving a word error rate (WER) of 25.13%.

Speech Recognition

Transformers Other

Faster Distil Whisper Large V3

Distilled version of Whisper Large v3 for efficient automatic speech recognition (ASR)

Speech Recognition English

A version of openai/whisper-tiny converted from the GGML format to Ratchet's custom format.

Speech Recognition

Audiosangraha Audio To Text

A speech-to-text model fine-tuned from openai/whisper-small, supporting audio translation and text generation tasks.

Speech Recognition

Whisper Small Ml

A fine-tuned version of openai/whisper-small for automatic speech recognition, supporting multiple languages.

Speech Recognition

Speecht5 Tts Marathi

This is a model for Marathi speech processing, potentially involving speech recognition or speech synthesis tasks.

Speech Recognition

Whisper Medium is a medium-scale speech recognition model developed by OpenAI, supporting automatic speech recognition (ASR) tasks in multiple languages.

Speech Recognition

Whisper Small is a small automatic speech recognition (ASR) model developed by OpenAI, capable of converting speech into text.

Speech Recognition

Whisper is an automatic speech recognition (ASR) system trained by OpenAI, supporting speech-to-text tasks in multiple languages.

Speech Recognition

Whisper Tiny is a lightweight speech recognition model open-sourced by OpenAI, suitable for web deployment.

Speech Recognition

A SpeechT5 automatic speech recognition model fine-tuned on the LibriSpeech dataset, supporting speech-to-text conversion.

Speech Recognition

Whisper is a pre-trained automatic speech recognition (ASR) and speech translation model, trained on 680k hours of labeled data with strong generalization capabilities.

Speech Recognition Supports Multiple Languages

Wav2vec2 Xls R 300m Mrbrown Finetune1

A speech recognition model fine-tuned using the uob_singlish dataset based on the facebook/wav2vec2-xls-r-300m pre-trained model

Speech Recognition

Wav2vec2 Large Xls R 300m Turkish Colab

A Turkish speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the Common Voice Turkish dataset, achieving a word error rate of 30.95% on the evaluation set.

Speech Recognition

Wav2vec2 Large 960h

Wav2Vec2 is a speech recognition model developed by Facebook. It learns speech representations from raw audio through self-supervised learning and is fine-tuned on the LibriSpeech dataset to achieve high-accuracy speech transcription.

Speech Recognition

Transformers English

Wav2vec2 2 Bart Base

A speech recognition model fine-tuned on the LibriSpeech ASR clean dataset, based on wav2vec2-base and bart-base

Speech Recognition

patrickvonplaten

Bp Cetuc100 Xlsr

Wav2vec2 model fine-tuned for Brazilian Portuguese using the CETUC dataset, trained with approximately 145 hours of Brazilian Portuguese speech data

Speech Recognition

Transformers Other

Asr Hubert Cluster Bart Base

An automatic speech recognition model based on the HuBERT and BART architectures, achieving speech-to-text conversion through clustered feature transformation

Speech Recognition

Transformers Supports Multiple Languages

Wav2vec2 Large Xls R 300m Ar

A speech recognition model fine-tuned on the Common Voice Arabic dataset based on facebook/wav2vec2-xls-r-300m

Speech Recognition

Wav2vec2 Tiny Random

A lightweight randomly initialized Wav2Vec2 model for speech recognition, primarily for testing and development purposes

Speech Recognition

patrickvonplaten

A fine-tuned Facebook wav2vec2 model for the speech-to-text module of The Sound of AI Open Source Research Group

Speech Recognition

Transformers English

Wav2vec2 Xls R 300m Kh

This is a baseline model for Khmer automatic speech recognition (ASR), designed to provide foundational support for Khmer speech recognition tasks.

Speech Recognition

Wav2vec2 2 Bart Large

An automatic speech recognition (ASR) model fine-tuned on the librispeech_asr-clean dataset, based on wav2vec2-large-lv60 and bart-large

Speech Recognition

patrickvonplaten

Wav2vec2 Large 100k Voxpopuli Ft Common Voice Plus TTS Dataset Russian

This is a speech recognition model based on Facebook's wav2vec2-large-100k-voxpopuli, fine-tuned using Common Voice 7.0 and M-AILABS Russian data.

Speech Recognition

Transformers Other

Waynehills STT Doogie Server

A fine-tuned speech recognition model based on Doogie/Waynehills-STT-doogie-server

Speech Recognition

Xls R 300m Ur Cv8 Hi

This is an Urdu automatic speech recognition model based on the wav2vec2 architecture, fine-tuned on the Common Voice 8.0 Urdu dataset

Speech Recognition

Transformers Other

HarrisDePerceptron
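
Several entries above report a word error rate (WER). As a rough sketch of what that figure measures, assuming the common definition (word-level edit distance between the reference and the model's transcript, divided by the number of reference words), it can be computed as below. The function name and the sample transcripts are illustrative only, and this is not necessarily the exact procedure or text normalization used to produce the figures quoted in the listings.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts, for illustration only.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because the denominator is the reference length, WER can exceed 100% when the transcript contains many inserted words, so figures are best compared only across models evaluated on the same test set with the same text normalization.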

© 2025 AIbase