Distil-large-v3.5-ct2 Open-source Speech Recognition Model - Achieve Efficient Speech Recognition for Free

Distil Large V3.5 Ct2

Developed by distil-whisper

Distil-Whisper is a distilled version of the Whisper model, achieving efficient speech recognition through large-scale pseudo-labeling technology

Speech Recognition | English | Open Source License: MIT | #Efficient Speech Recognition #Multilingual Support #Low-latency Inference

Downloads 264

Release Time: 3/14/2025

Model Overview

An efficient speech recognition model optimized through distillation of the Whisper model, converted to CTranslate2 format for faster inference speed

Model Features

Efficient Inference

Optimized with the CTranslate2 engine, offering faster inference speed compared to the original Whisper model

Knowledge Distillation

Distills knowledge from the Whisper model using large-scale pseudo-labeling technology while maintaining high accuracy

Hardware Adaptation

Supports both CPU and GPU operation, automatically selecting the optimal computation type (float16/float32)

Model Capabilities

English Speech Recognition

Audio File Transcription

Real-time Speech-to-Text

Use Cases

Speech Transcription

Meeting Minutes

Automatically convert meeting recordings into text transcripts

Podcast Transcription

Convert podcast audio content into searchable text

Assistive Tools

Subtitle Generation

Automatically generate English subtitles for video content
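
As an illustration of the subtitle use case, the sketch below turns timed segments into SubRip (SRT) text. It assumes each segment exposes `start`, `end` (in seconds) and `text`, matching the fields printed in the usage example further down this card. `Segment`, `srt_timestamp` and `segments_to_srt` are hypothetical names introduced here and are not part of Faster-Whisper.

```python
from typing import Iterable, NamedTuple


class Segment(NamedTuple):
    """Stand-in for a transcription segment (times in seconds)."""
    start: float
    end: float
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: Iterable) -> str:
    """Render segments as SRT cues, numbered from 1 and separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    return "\n".join(cues)
```

Because the function only reads `start`, `end` and `text`, the segments yielded by `model.transcribe(...)` should be usable directly in place of the stand-in `Segment` type.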

🚀 Distil-Whisper: Distil-Large-v3.5 for CTranslate2

This repository provides the model weights of distil-large-v3.5 converted to CTranslate2 format. CTranslate2 is a high-speed inference engine for Transformer models and serves as the supported backend for the Faster-Whisper package.

✨ Features

Converted model weights of distil-large-v3.5 to CTranslate2 format.
Compatible with the Faster-Whisper package for efficient automatic speech recognition.

📦 Installation

To use the model in Faster-Whisper, first install the PyPI package according to the official instructions.

For this example, we'll also install 🤗 Datasets to load a toy audio dataset from the Hugging Face Hub:

pip install --upgrade pip
pip install --upgrade "git+https://github.com/SYSTRAN/faster-whisper" "datasets[audio]"

💻 Usage Examples

Basic Usage

import torch
from faster_whisper import WhisperModel
from datasets import load_dataset

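# pick device and precision; torch is used here only to detect CUDA
# (install it separately if your environment does not already provide it)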
device = "cuda" if torch.cuda.is_available() else "cpu"
compute_type = "float16" if torch.cuda.is_available() else "float32"

# load model on GPU if available, else cpu
model = WhisperModel("distil-whisper/distil-large-v3.5-ct2", device=device, compute_type=compute_type)

# load toy dataset for example
dataset = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
sample = dataset[1]["audio"]["path"]

segments, info = model.transcribe(sample, beam_size=5, language="en")

for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
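
In the example above, torch is imported only to check for a CUDA device. The following is a minimal sketch of the same selection without torch, assuming the CTranslate2 Python package (installed as a dependency of Faster-Whisper) provides `get_cuda_device_count()` and `get_supported_compute_types()`. The helper names `pick_compute_type` and `detect_device_and_compute_type` are hypothetical, and the preference order simply mirrors the float16-on-GPU, float32-on-CPU choice made above.

```python
def pick_compute_type(device, supported):
    """Return the preferred compute type for `device` among `supported`.

    Hypothetical helper: prefers float16 on CUDA and float32 otherwise,
    falling back to float32 when the preferred type is not supported.
    """
    preferred = "float16" if device == "cuda" else "float32"
    return preferred if preferred in supported else "float32"


def detect_device_and_compute_type():
    """Ask CTranslate2 whether a CUDA device exists and what it supports."""
    import ctranslate2  # imported lazily so pick_compute_type has no dependencies

    device = "cuda" if ctranslate2.get_cuda_device_count() > 0 else "cpu"
    supported = ctranslate2.get_supported_compute_types(device)
    return device, pick_compute_type(device, supported)
```

The returned pair can then be passed as `device=` and `compute_type=` when constructing `WhisperModel`.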

Advanced Usage

To transcribe a local audio file, simply pass the path to the audio file as the audio argument to transcribe:

segments, info = model.transcribe("audio.mp3", beam_size=5, language="en")
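
For several local files, the same call can be wrapped in a loop. The sketch below is one way to do that, not part of the Faster-Whisper API: `transcribe_files` is a hypothetical helper that accepts any object with a `transcribe(path, **kwargs)` method returning `(segments, info)`, as `WhisperModel` does above. It joins segment texts without a separator on the assumption that each segment's text carries its own leading space.

```python
def transcribe_files(model, paths, **transcribe_kwargs):
    """Transcribe each path and return a dict mapping path to transcript text.

    `model` must expose transcribe(path, **kwargs) -> (segments, info).
    """
    results = {}
    for path in paths:
        segments, _info = model.transcribe(str(path), **transcribe_kwargs)
        # The segments are consumed exactly once here; if they come from a
        # generator, this iteration is where the decoding work happens.
        text = "".join(segment.text for segment in segments)
        results[str(path)] = text.strip()
    return results
```

A call such as `transcribe_files(model, ["audio.mp3"], beam_size=5, language="en")` would reuse the settings from the example above.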

📚 Documentation

For more information about the Distil-Large-v3.5 model, refer to the original model card.

📄 License

Distil-Whisper inherits the MIT license from OpenAI's Whisper model.

📚 Citation

If you use this model, please consider citing the Distil-Whisper paper:

@misc{gandhi2023distilwhisper,
      title={Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling}, 
      author={Sanchit Gandhi and Patrick von Platen and Alexander M. Rush},
      year={2023},
      eprint={2311.00430},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
