🚀 Whisper Small Sl - samolego
This model is a fine-tuned version of openai/whisper-small on the ASR database ARTUR 1.0 (audio) dataset. It addresses the need for accurate speech-to-text conversion in Slovenian, leveraging the pre-trained capabilities of the base model and fine-tuning them on this dataset. On the evaluation set it reaches a loss of 0.1226 and a word error rate (WER) of 11.0097.
⨠Features
- Multiple Formats: Both `ggml` and `safetensors` formats are available.
- Fine-Tuned Performance: Achieves a loss of 0.1226 and a WER of 11.0097 on the evaluation set.
📦 Installation
The original document does not list installation steps. Since the model was trained with Hugging Face Transformers and PyTorch (see the framework versions below), installing those libraries, e.g. `pip install transformers torch`, should be sufficient to run it.
💻 Usage Examples
The original document does not include code examples.
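As a minimal sketch of typical usage, the model can be run through the Transformers ASR pipeline. The repository id `samolego/whisper-small-sl` is an assumption inferred from the title and is not confirmed by the original card.

```python
# Minimal sketch: transcribing an audio file with the Transformers ASR pipeline.
# Assumptions: the repo id "samolego/whisper-small-sl" (inferred from the title)
# and a local file "sample.wav". Decoding audio files requires ffmpeg.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="samolego/whisper-small-sl",  # assumed repository id
)

result = asr("sample.wav")  # path to a Slovenian speech recording
print(result["text"])
```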
📚 Documentation
Model description
Both `ggml` and `safetensors` formats are available. If you're not familiar with ggml, I'd suggest checking out [whisper.cpp](https://github.com/ggerganov/whisper.cpp).
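For the `safetensors` weights, loading the model directly with Transformers is one plausible path; below is a sketch under the same assumed repository id, with a placeholder waveform standing in for real audio.

```python
# Sketch: loading the safetensors weights via Transformers and transcribing
# a 16 kHz mono waveform. The repo id is an assumption, not confirmed by the card.
import torch
from transformers import WhisperProcessor, WhisperForConditionalGeneration

processor = WhisperProcessor.from_pretrained("samolego/whisper-small-sl")
model = WhisperForConditionalGeneration.from_pretrained("samolego/whisper-small-sl")

waveform = torch.zeros(16000)  # placeholder: one second of silence at 16 kHz
inputs = processor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")

predicted_ids = model.generate(inputs.input_features)
text = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
print(text)
```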
Intended uses & limitations
More information needed
Training and evaluation data
Verdonik, Darinka; et al., 2023, ASR database ARTUR 1.0 (audio), Slovenian language resource repository CLARIN.SI, ISSN 2820-4042, http://hdl.handle.net/11356/1776.
Training procedure
Training hyperparameters
The following hyperparameters were used during training (see the code sketch after this list):
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0
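As a reproduction aid, here is a hedged sketch of how these hyperparameters would map onto `Seq2SeqTrainingArguments` in Transformers; `output_dir` and the 500-step evaluation cadence (inferred from the results table below) are assumptions.

```python
# Sketch only: the hyperparameters above expressed as Transformers
# Seq2SeqTrainingArguments. output_dir and eval_steps are assumptions;
# eval_steps=500 is inferred from the 500-step rows in the results table.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-small-sl",   # assumed output location
    learning_rate=5e-05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=3.0,
    evaluation_strategy="steps",       # assumed: matches the per-step results
    eval_steps=500,
)
```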
Training results
| Training Loss | Epoch | Step | Validation Loss | WER |
|:-------------:|:-----:|:----:|:---------------:|:-------:|
| 0.2778 | 0.07 | 500 | 0.2748 | 23.0421 |
| 0.2009 | 0.14 | 1000 | 0.1972 | 17.3073 |
| 0.1643 | 0.21 | 1500 | 0.1658 | 14.5195 |
| 0.1569 | 0.28 | 2000 | 0.1495 | 13.1550 |
| 0.1344 | 0.36 | 2500 | 0.1380 | 12.2945 |
| 0.1295 | 0.43 | 3000 | 0.1302 | 11.6237 |
| 0.1239 | 0.5 | 3500 | 0.1249 | 11.2128 |
| 0.1178 | 0.57 | 4000 | 0.1226 | 11.0097 |
Framework versions
- Transformers 4.39.0.dev0
- PyTorch 2.0.1+cu117
- Datasets 2.18.0
- Tokenizers 0.15.2
🔧 Technical Details
The model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the ASR database ARTUR 1.0 (audio). It was trained with a well-defined set of hyperparameters, including a learning rate of 5e-05 and the Adam optimizer, over 3 epochs, and evaluated on loss and WER.
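For readers unfamiliar with the metric, WER (word error rate) counts substitutions, insertions, and deletions against the reference transcript. A quick illustration with the `evaluate` library (the example strings below are made up):

```python
# Sketch: computing WER with the `evaluate` library (pip install evaluate jiwer).
# The prediction/reference strings below are illustrative only.
import evaluate

wer_metric = evaluate.load("wer")
predictions = ["danes je lep dan"]
references = ["danes je lep sončen dan"]

# WER = (substitutions + insertions + deletions) / reference word count
wer = wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.2f}")  # 0.20 here: one deleted word over five reference words
```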
📄 License
This model is released under the Apache 2.0 license.