Whisper-tiny-german Open-Source German Speech Recognition Model - Free Deployment, Suitable for Edge Scenarios

Whisper Tiny German

Developed by primeline

A German speech recognition model based on whisper-tiny, with 37.8 million parameters, suitable for edge scenarios sensitive to model size.

Speech Recognition

Transformers

German

Open Source License: Apache-2.0

#German Speech Recognition #Lightweight Model #Edge Computing

Downloads 198

Release Time: 4/15/2024

Model Overview

A lightweight model specifically designed for German speech recognition tasks, suitable for edge computing scenarios requiring small models, but not recommended for mission-critical applications.

Model Features

Lightweight Design

The model size is only 73MB (bfloat16 format), suitable for deployment on edge devices.

German Optimization

Specifically trained and optimized for German speech recognition tasks.

Multi-Source Training

Trained using Common Voice, Multilingual LibriSpeech, and internal data.

Model Capabilities

German Speech Recognition

Edge Device Deployment

Real-Time Speech-to-Text

Use Cases

Edge Computing

Mobile Voice Input

Enables German voice input on resource-constrained mobile devices.

Embedded Device Voice Control

Provides localized German voice control for embedded devices like smart home appliances.

🚀 Whisper Tiny German

This is a German speech recognition model based on the whisper-tiny model, designed for edge deployments where model size is a concern.

🚀 Quick Start

The following is an example of how to use the whisper-tiny-german model for automatic speech recognition:

import torch
from datasets import load_dataset
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline

# Use the GPU with half precision when available, otherwise the CPU with float32.
device = "cuda:0" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

# Load the model weights and the matching processor (tokenizer + feature extractor).
model_id = "primeline/whisper-tiny-german"
model = AutoModelForSpeechSeq2Seq.from_pretrained(
    model_id, torch_dtype=torch_dtype, low_cpu_mem_usage=True, use_safetensors=True
)
model.to(device)
processor = AutoProcessor.from_pretrained(model_id)

# Build an ASR pipeline that splits long audio into 30-second chunks.
pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    max_new_tokens=128,
    chunk_length_s=30,
    batch_size=16,
    return_timestamps=True,
    torch_dtype=torch_dtype,
    device=device,
)

# Note: this sample dataset contains English audio; substitute German recordings
# to get meaningful transcriptions from this model.
dataset = load_dataset("distil-whisper/librispeech_long", "clean", split="validation")
sample = dataset[0]["audio"]

result = pipe(sample)
print(result["text"])
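
Because the pipeline is created with return_timestamps=True, its result also carries a "chunks" list next to "text". The following is a minimal sketch of turning that list into readable lines. The helper name format_chunks and the sample dictionary are illustrative assumptions, not part of the model card; the dictionary only mimics the shape of a Transformers ASR pipeline result.

```python
def format_chunks(result):
    """Render pipeline output chunks as '[start - end] text' lines."""
    lines = []
    for chunk in result.get("chunks", []):
        start, end = chunk["timestamp"]
        # A timestamp can be None, for example when the audio ends mid-segment.
        start_s = "?" if start is None else f"{start:.2f}s"
        end_s = "?" if end is None else f"{end:.2f}s"
        lines.append(f"[{start_s} - {end_s}] {chunk['text'].strip()}")
    return lines


# Hypothetical output, shaped like an ASR pipeline result (not real model output).
example = {
    "text": " Guten Morgen. Wie geht es Ihnen?",
    "chunks": [
        {"timestamp": (0.0, 1.4), "text": " Guten Morgen."},
        {"timestamp": (1.4, None), "text": " Wie geht es Ihnen?"},
    ],
}

formatted = format_chunks(example)
for line in formatted:
    print(line)
```

With the quick-start code, the same helper could be called as format_chunks(result).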

✨ Features

Compact Size: The model has 37.8M parameters, and its weights take only 73 MB in bfloat16 format, which suits edge devices with limited resources.
German Speech Recognition: Specifically designed for German speech recognition tasks.
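
The size figure can be sanity-checked with simple arithmetic: bfloat16 and float16 store each parameter in 2 bytes, float32 in 4. The sketch below is a rough estimate under that assumption only. The helper name weight_size_mb is illustrative, and the actual file size differs slightly because the parameter count is rounded and the file carries metadata.

```python
# Bytes needed to store one parameter in each numeric format.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def weight_size_mb(num_params, dtype):
    """Approximate size of the raw weights in megabytes (10**6 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6


NUM_PARAMS = 37.8e6  # parameter count stated in the model card

for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: {weight_size_mb(NUM_PARAMS, dtype):6.1f} MB")
```

The bfloat16 estimate lands in the mid-70 MB range, in line with the quoted size, while float32 weights would need roughly twice as much.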

📚 Documentation

Intended uses & limitations

The model is intended for German speech recognition tasks, especially in edge deployments where model size is a concern. However, it is not recommended for critical use cases, because its small size means it may not perform well in all scenarios.
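
One way to judge whether the model is good enough for a given scenario is to measure its word error rate (WER) on your own recordings. The model card does not describe an evaluation procedure, so the following is only a minimal, dependency-free sketch. The function name word_error_rate and the sample sentences are illustrative assumptions, and real evaluations usually normalize casing and punctuation first.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # reference word missing from hypothesis
                    curr[j - 1] + 1,     # extra word in hypothesis
                    prev[j - 1] + cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical reference transcript and model output.
reference = "das ist ein kurzer test"
hypothesis = "das ist kurzer text"

wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.2f}")
```

Here one word is dropped and one is substituted out of five reference words, so the score reflects two errors.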

Dataset

The training data consists of a filtered subset of the Common Voice dataset, Multilingual LibriSpeech, and some internal data. The data was carefully filtered and double-checked for quality and correctness, and the text was normalized, especially for casing and punctuation.

Model family

Property	Details
Model Type	German Speech Recognition Model
Training Data	Filtered subset of the Common Voice dataset, Multilingual LibriSpeech, and some internal data

Model	Parameters	link
Whisper large v3 german	1.54B	link
Whisper large v3 turbo german	809M	link
Distil-whisper large v3 german	756M	link
tiny whisper	37.8M	link

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-05
total_train_batch_size: 512
num_epochs: 5.0
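
In the Hugging Face Trainer, the total train batch size is the product of the per-device batch size, the number of devices, and the gradient accumulation steps. The model card reports only the total of 512, so the split shown below is purely hypothetical and serves only to illustrate the arithmetic.

```python
def total_train_batch_size(per_device_batch_size, num_devices, gradient_accumulation_steps):
    """Effective number of samples contributing to each optimizer step."""
    return per_device_batch_size * num_devices * gradient_accumulation_steps


# Hypothetical split; the actual per-device values are not given in the model card.
effective = total_train_batch_size(
    per_device_batch_size=32,
    num_devices=4,
    gradient_accumulation_steps=4,
)
print(f"total_train_batch_size: {effective}")
```

Any other combination with the same product would be reported as the same total.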

Framework versions

Transformers 4.39.3
PyTorch 2.3.0a0+ebedce2
Datasets 2.18.0
Tokenizers 0.15.2
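
To compare a local environment against these versions, the standard library can report what is installed. This is a small sketch rather than a requirement check: the helper name compare_versions is an illustrative assumption, and newer library versions may well work too.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in the model card, keyed by their pip package names.
REPORTED = {
    "transformers": "4.39.3",
    "torch": "2.3.0a0+ebedce2",
    "datasets": "2.18.0",
    "tokenizers": "0.15.2",
}


def compare_versions(reported):
    """Map each package to (version from the card, locally installed version or None)."""
    report = {}
    for package, expected in reported.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None
        report[package] = (expected, installed)
    return report


report = compare_versions(REPORTED)
for package, (expected, installed) in report.items():
    status = "not installed" if installed is None else installed
    print(f"{package}: card lists {expected}, local environment has {status}")
```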

📄 License

This model is published under the Apache-2.0 license.

About us

Your partner for AI infrastructure in Germany. Experience the powerful AI infrastructure that drives your ambitions in Deep Learning, Machine Learning & High-Performance Computing. Optimized for AI training and inference.

Model author: Florian Zimmermeister

Disclaimer

This model is not a product of the primeLine Group. 

It represents research conducted by [Florian Zimmermeister](https://huggingface.co/flozi00), with computing power sponsored by primeLine. 

The model is published under this account by primeLine, but it is not a commercial product of primeLine Solutions GmbH.

Please be aware that while we have tested and developed this model to the best of our abilities, errors may still occur. 

Use of this model is at your own risk. We do not accept liability for any incorrect outputs generated by this model.
