# Long audio processing

A lightweight audio model that excels at speech recognition, audio understanding, and following audio instructions, among other tasks.

Transformers English

Whisper Large V3 Vaani Hindi

A Hindi speech recognition model fine-tuned from OpenAI's Whisper-Large-V3 on approximately 718 hours of transcribed Hindi speech data.

Speech Recognition

Whisper Large V3 Turbo

Whisper large-v3-turbo is an automatic speech recognition and speech translation model proposed by OpenAI, trained with large-scale weak supervision and supporting multiple languages.

Speech Recognition

Transformers Supports Multiple Languages

Chunkformer Large Vie

A large Vietnamese automatic speech recognition model based on the ChunkFormer architecture, fine-tuned on approximately 3,000 hours of publicly available Vietnamese speech data.

Speech Recognition

Whisper Large V3 Turbo Turkish

A Turkish speech recognition model fine-tuned from openai/whisper-large-v3-turbo on the Common Voice 17.0 dataset.

Speech Recognition

Transformers Other

Whisper Large V3 Turbo

Whisper large-v3-turbo is a pruned and fine-tuned version of OpenAI Whisper large-v3, with the decoder layers reduced from 32 to 4, which significantly improves speed at a small cost in quality.

Speech Recognition Supports Multiple Languages

Faster Whisper Large V3 Ru Podlodka Int8

A Russian speech recognition model based on the OpenAI Whisper architecture, optimized for Russian speech-to-text tasks and converted to CTranslate2 format for more efficient inference.

Speech Recognition Other

Whisper Tiny En

An English speech recognition and translation model optimized for mobile deployment, implemented by Qualcomm.

Speech Recognition

Nb Whisper Base

An automatic speech recognition model developed by the National Library of Norway, based on the OpenAI Whisper architecture, supporting transcription in Norwegian and English.

Speech Recognition

Nb Whisper Large

A Norwegian automatic speech recognition model released by the National Library of Norway, based on OpenAI's Whisper architecture, supporting multiple Norwegian dialects and English.

Speech Recognition

Transformers Supports Multiple Languages

Nb Whisper Large

An automatic speech recognition model developed by the National Library of Norway, based on the Whisper architecture, supporting speech transcription and translation of Norwegian and English.

Speech Recognition

Whisper Large V3

Whisper is an advanced automatic speech recognition (ASR) and speech translation model proposed by OpenAI, trained on over 5 million hours of labeled data, with strong cross-dataset and cross-domain generalization capabilities.

Speech Recognition Supports Multiple Languages

Whisper Tamil Large V2

A Tamil speech recognition model fine-tuned from OpenAI Whisper-large-v2 on multiple public Tamil ASR corpora.

Speech Recognition Other

Wav2vec2 Large Xls R 300m Bg

An automatic speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the Common Voice 8 Bulgarian dataset.

Speech Recognition

Transformers Other
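Most of the models above are Whisper variants, which process audio in windows of about 30 seconds, so long recordings are usually split into overlapping chunks before transcription. The following is a minimal sketch of that idea, not the method of any listed model: `chunk_spans` and `transcribe_long` are hypothetical helper names, the 30-second window and 5-second overlap are illustrative defaults, and the model ID is only an example. `transcribe_long` assumes the `transformers` library (and ffmpeg for decoding files) is installed, and relies on the pipeline's `chunk_length_s` option for chunked inference.

```python
from typing import Iterator, Tuple


def chunk_spans(
    total_s: float, chunk_s: float = 30.0, overlap_s: float = 5.0
) -> Iterator[Tuple[float, float]]:
    """Yield (start, end) times in seconds for overlapping windows over the audio."""
    if chunk_s <= 0 or not 0 <= overlap_s < chunk_s:
        raise ValueError("need chunk_s > 0 and 0 <= overlap_s < chunk_s")
    step = chunk_s - overlap_s  # how far each new window advances
    start = 0.0
    while start < total_s:
        end = min(start + chunk_s, total_s)
        yield (start, end)
        if end >= total_s:  # the last window reached the end of the audio
            break
        start += step


def transcribe_long(path: str, model_id: str = "openai/whisper-large-v3-turbo") -> str:
    """Transcribe a long audio file with chunked inference (assumed setup)."""
    # Imported lazily so chunk_spans works without transformers installed.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,  # split long inputs into 30-second chunks
    )
    return asr(path)["text"]
```

Calling `list(chunk_spans(70.0))` shows which time ranges would be covered for a 70-second recording. In practice the pipeline performs its own chunking and stitching, so the helper is only there to make the windowing visible.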

© 2025 AIbase