🚀 Whisper Large v3 Turbo TR - Selim Çavaş
This model is a fine-tuned version of openai/whisper-large-v3-turbo on the Common Voice 17.0 dataset. It is intended for Turkish speech recognition and achieves high accuracy when transcribing Turkish audio.
🚀 Quick Start
Basic Usage
```python
import torch
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline

# Use the GPU with half precision if available, otherwise fall back to CPU.
device = "cuda:0" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

model_id = "selimc/whisper-large-v3-turbo-turkish"

model = AutoModelForSpeechSeq2Seq.from_pretrained(
    model_id, torch_dtype=torch_dtype, low_cpu_mem_usage=True, use_safetensors=True
)
model.to(device)

processor = AutoProcessor.from_pretrained(model_id)

pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    chunk_length_s=30,       # split long audio into 30-second chunks
    batch_size=16,
    return_timestamps=True,  # also return per-segment timestamps
    torch_dtype=torch_dtype,
    device=device,
)

result = pipe("test.mp3")
print(result["text"])
```
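To try the pipeline on a Common Voice sample instead of a local file, something like the following should work, continuing from the setup above. This assumes you have the `datasets` library installed and have accepted the dataset's terms on the Hugging Face Hub:

```python
from datasets import Audio, load_dataset

# Stream a single sample from the Turkish test split and resample it
# to the 16 kHz rate Whisper expects.
dataset = load_dataset(
    "mozilla-foundation/common_voice_17_0", "tr", split="test", streaming=True
)
dataset = dataset.cast_column("audio", Audio(sampling_rate=16000))
sample = next(iter(dataset))["audio"]

result = pipe(sample)
print(result["text"])
```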
✨ Features
- Versatile Applications: The model can be used across a range of applications, including transcription of Turkish speech, voice commands, and automatic subtitling for Turkish videos.
- High Accuracy: Achieves a word error rate (WER) of 18.9229 on the Common Voice 17.0 test set.
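For context, WER counts the minimum number of word-level substitutions, deletions, and insertions needed to turn a hypothesis into its reference transcript, divided by the number of reference words, and is usually reported as a percentage. A minimal sketch using the `evaluate` library (this assumes `evaluate` and its `jiwer` dependency are installed; the example strings are made up):

```python
import evaluate

wer_metric = evaluate.load("wer")

# Hypothetical example: one deleted word out of three reference words -> WER ~33.33%.
wer = wer_metric.compute(
    predictions=["merhaba dünya"],
    references=["merhaba güzel dünya"],
)
print(f"WER: {100 * wer:.2f}%")
```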
📦 Installation
The code example above assumes you have installed the necessary libraries. You can install them using the following command:
```bash
pip install transformers torch datasets tokenizers
```
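Note: passing a file path such as `test.mp3` to the pipeline relies on `ffmpeg` being installed on your system for audio decoding, and `low_cpu_mem_usage=True` also needs the `accelerate` package (`pip install accelerate`).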
📚 Documentation
Intended uses & limitations
This model can be used in various application areas, including:
- Transcription of Turkish speech
- Voice commands
- Automatic subtitling for Turkish videos
How To Use
The code example in the "Quick Start" section demonstrates how to use the model for automatic speech recognition.
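For the subtitling use case, one possible approach (continuing from the Quick Start setup, which already enables `return_timestamps=True`) is to convert the pipeline's timestamped chunks into SRT subtitles. The `to_srt` helper and the file name below are hypothetical, not part of the model or the transformers library:

```python
def to_srt(chunks):
    """Convert pipeline chunks ({"timestamp": (start, end), "text": ...}) to SRT."""
    def fmt(seconds):
        if seconds is None:  # the final chunk can have an open-ended timestamp
            seconds = 0.0
        h, rem = divmod(seconds, 3600)
        m, s = divmod(rem, 60)
        return f"{int(h):02}:{int(m):02}:{int(s):02},{int((s % 1) * 1000):03}"

    entries = []
    for i, chunk in enumerate(chunks, start=1):
        start, end = chunk["timestamp"]
        entries.append(f"{i}\n{fmt(start)} --> {fmt(end)}\n{chunk['text'].strip()}\n")
    return "\n".join(entries)

result = pipe("video_audio.mp3")  # hypothetical audio track extracted from a video
print(to_srt(result["chunks"]))
```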
Training
Due to Colab GPU constraints, only 25% of the Turkish data available in the Common Voice 17.0 dataset was used for training. 😔
Got a GPU to spare? Let's collaborate and take this model to the next level! 🚀
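For reference, one simple way to take a 25% slice of the Turkish training split with the `datasets` library is shown below. The exact subsetting strategy used for this model is not documented, so treat this as an assumption (the dataset also requires accepting its terms on the Hugging Face Hub):

```python
from datasets import load_dataset

# Hypothetical reconstruction: take the first 25% of the Turkish train split.
common_voice_tr = load_dataset(
    "mozilla-foundation/common_voice_17_0", "tr", split="train[:25%]"
)
print(common_voice_tr)
```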
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- training_steps: 4000
- mixed_precision_training: Native AMP
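As a rough sketch, these hyperparameters map onto transformers' `Seq2SeqTrainingArguments` as follows. The training script was not published, so the output directory, evaluation schedule, and other unlisted settings are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-large-v3-turbo-turkish",  # assumed
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=4000,
    fp16=True,  # Native AMP mixed precision
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the Trainer defaults.
    eval_strategy="steps",
    eval_steps=1000,  # assumed from the 1000-step evaluation log below
    predict_with_generate=True,  # assumed, needed to compute WER during eval
)
```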
Training results
| Training Loss | Epoch | Step | Validation Loss | WER |
|:-------------:|:-----:|:----:|:---------------:|:-------:|
| 0.1223 | 1.6 | 1000 | 0.3187 | 24.4415 |
| 0.0501 | 3.2 | 2000 | 0.3123 | 20.9720 |
| 0.0226 | 4.8 | 3000 | 0.3010 | 19.6183 |
| 0.001  | 6.4 | 4000 | 0.3123 | 18.9229 |
Framework versions
- Transformers 4.45.2
- Pytorch 2.4.1+cu121
- Datasets 3.0.1
- Tokenizers 0.20.1
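To approximate the original environment, the listed versions can be pinned (exact CUDA builds may differ on your machine):

```bash
pip install transformers==4.45.2 torch==2.4.1 datasets==3.0.1 tokenizers==0.20.1
```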
📄 License
This model is released under the MIT license.
📦 Model Information
| Property | Details |
|----------|---------|
| Library Name | transformers |
| Base Model | openai/whisper-large-v3-turbo |
| Tags | generated_from_trainer |
| Datasets | mozilla-foundation/common_voice_17_0 |
| Metrics | wer |
| Model Name | Whisper Large v3 Turbo TR - Selim Çavaş |
| Evaluation Results | Loss: 0.3123, WER: 18.9229 |