Whosper-large-v2: Open-Source Speech Recognition Model for Wolof Speech in Senegal

Whosper Large V2

Developed by CAYTU

Whosper-large-v2 is a speech recognition model designed for Wolof, the primary language of Senegal. Built upon OpenAI's Whisper-large-v2, it lowers Word Error Rate (WER) and Character Error Rate (CER) relative to the earlier whosper-large model.

Speech Recognition

Safetensors

Supports Multiple Languages

Open Source License: Apache-2.0

#Wolof speech recognition #Code-switching optimization #African language processing

Downloads 449

Release Time: 1/15/2025

Model Overview

This model focuses on Wolof speech recognition and also supports French and English. It handles code-switching between these languages, which makes it suitable for transcribing conversations, building language learning tools, or conducting research.

Model Features

Exceptional Code-Switching

Naturally handles mixing between Wolof and French/English, reflecting real-world speech patterns.

Multilingual Support

Performs well in French and English, in addition to Wolof.

Production-Ready

Thoroughly tested and optimized, suitable for deployment.

Open Source

Released under the Apache-2.0 license, well suited to research and development.

Focus on African NLP

Committed to the broader goal of supporting African languages.

Model Capabilities

Wolof speech recognition

French speech recognition

English speech recognition

Code-switching processing

Use Cases

Speech Transcription

Conversation Transcription

Transcribe Wolof conversation content

WER 0.2345, CER 0.1101

Education

Language Learning Tools

Build speech recognition components for Wolof learning applications

Research

African Language Processing Research

Used for research related to African language speech recognition

🚀 Whosper-large-v2

Whosper-large-v2 is a state-of-the-art speech recognition model designed for Wolof, with significant improvements in WER and CER, suitable for researchers, developers, and students working with Wolof speech data.

🚀 Quick Start

Installation

pip install git+https://github.com/sudoping01/whosper.git

Basic Usage

from whosper import WhosperTranscriber

# Initialize the transcriber
transcriber = WhosperTranscriber(model_id="CAYTU/whosper-large-v2") 

# Transcribe an audio file
result = transcriber.transcribe_audio("path/to/your/audio.wav")
print(result)
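
To transcribe several recordings in one go, the call above can be wrapped in a small loop. The sketch below is illustrative only: `transcribe_folder` is a hypothetical helper, not part of the `whosper` package, and it treats the transcriber as a plain callable that takes a file path and returns a result (the exact return type of `transcribe_audio` is not documented here, so the helper stores whatever comes back).

```python
from pathlib import Path
from typing import Any, Callable, Dict


def transcribe_folder(
    transcribe: Callable[[str], Any],
    folder: str,
    pattern: str = "*.wav",
) -> Dict[str, Any]:
    """Run `transcribe` on every file in `folder` matching `pattern`.

    `transcribe` is any callable mapping an audio path to a result, for
    example `transcriber.transcribe_audio` from the Basic Usage snippet.
    Files are processed in name order; a failure on one file is recorded
    and does not stop the rest of the batch.
    """
    results: Dict[str, Any] = {}
    for audio_path in sorted(Path(folder).glob(pattern)):
        try:
            results[audio_path.name] = transcribe(str(audio_path))
        except Exception as exc:  # hypothetical policy: record and continue
            results[audio_path.name] = f"<error: {exc}>"
    return results


# Intended use (requires the whosper package and model weights):
#   from whosper import WhosperTranscriber
#   transcriber = WhosperTranscriber(model_id="CAYTU/whosper-large-v2")
#   transcripts = transcribe_folder(transcriber.transcribe_audio, "recordings/")
```

Passing the transcriber in as a callable keeps the loop independent of the model, so it can be exercised with a stub before any weights are downloaded.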

✨ Features

Superior Code-Switching: Handles natural Wolof-French/English mixing, mirroring real-world speech patterns.
Multilingual: Performs well in French and English in addition to Wolof.
Production-Ready: Thoroughly tested and optimized for deployment.
Open Source: Released under the [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) license, perfect for research and development.
African NLP Focus: Contributing to the broader goal of comprehensive African language support.
Improved WER and CER compared to [whosper-large](https://huggingface.co/sudoping01/whosper-large).
Optimized for Wolof and French recognition.
Enhanced performance on bilingual content.

📚 Documentation

Model Overview

Whosper-large-v2 is a cutting-edge speech recognition model tailored for Wolof, Senegal's primary language. Built on OpenAI's [Whisper-large-v2](https://huggingface.co/openai/whisper-large-v2), it advances African language processing with notable improvements in Word Error Rate (WER) and Character Error Rate (CER). Whether you're transcribing conversations, building language learning tools, or conducting research, this model is designed for researchers, developers, and students working with Wolof speech data.

Performance Metrics

WER: 0.2345
CER: 0.1101

Both metrics are error rates, so lower values mean better accuracy.
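
For readers unfamiliar with these metrics, here is a minimal sketch of how WER and CER are conventionally computed: the edit distance (substitutions, deletions, insertions) between the reference and the model output, divided by the reference length, counted in words for WER and in characters for CER. The evaluation script behind the figures above is not shown in this card, so the whitespace tokenization and the lack of text normalization below are assumptions, and the example sentences are placeholders rather than test data.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (row-by-row DP)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from the reference
                curr[j - 1] + 1,     # insertion into the hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance / reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: char-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    ref = "the cat sat on the mat"
    hyp = "the cat sit on mat"
    # One substitution (sat -> sit) and one deletion (the): 2 errors / 6 words.
    print(f"WER = {wer(ref, hyp):.4f}")
    print(f"CER = {cer(ref, hyp):.4f}")
```

On this scale, a WER of 0.2345 corresponds to roughly 23 word-level errors per 100 reference words.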

Performance Comparison

Metric	Whosper-large-v2	Whosper-large	Improvement (relative)
WER	0.2345	0.2423	3.2% better
CER	0.1101	0.1135	3.0% better
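
The Improvement column is a relative reduction, not a difference in percentage points. A short sketch of the arithmetic, using only the figures from the table (`relative_reduction` is a helper name chosen for this example):

```python
def relative_reduction(baseline: float, new: float) -> float:
    """Percentage by which `new` is lower than `baseline`."""
    return (baseline - new) / baseline * 100.0


# (whosper-large, whosper-large-v2) scores from the comparison table
scores = {"WER": (0.2423, 0.2345), "CER": (0.1135, 0.1101)}

for metric, (baseline, new) in scores.items():
    absolute = (baseline - new) * 100.0  # difference in percentage points
    relative = relative_reduction(baseline, new)
    print(f"{metric}: {absolute:.2f} points absolute, {relative:.1f}% relative")
```

In absolute terms the gains are 0.78 percentage points of WER and 0.34 percentage points of CER.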

Training Results

Training Loss	Epoch	Step	Validation Loss
0.7575	0.9998	2354	0.7068
0.6429	1.9998	4708	0.6073
0.5468	2.9998	7062	0.5428
0.4439	3.9998	9416	0.4935
0.3208	4.9998	11770	0.4600
0.2394	5.9998	14124	0.4490
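
One way to read this table is to look at how much the validation loss falls from one epoch to the next. The sketch below only re-uses the numbers above; `epoch_drops` is a name made up for this example.

```python
from typing import List

# Validation loss at the end of epochs 1 to 6, copied from the table above
validation_loss = [0.7068, 0.6073, 0.5428, 0.4935, 0.4600, 0.4490]


def epoch_drops(losses: List[float]) -> List[float]:
    """Decrease in loss between each pair of consecutive epochs."""
    return [round(prev - curr, 4) for prev, curr in zip(losses, losses[1:])]


for epoch, drop in enumerate(epoch_drops(validation_loss), start=2):
    print(f"epoch {epoch}: validation loss fell by {drop:.4f}")
```

The validation loss decreases at every epoch, but by a smaller amount each time (0.0995 after the second epoch, 0.0110 after the sixth).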

Framework Versions

PEFT: 0.14.1.dev0
Transformers: 4.49.0.dev0
PyTorch: 2.5.1+cu124
Datasets: 3.2.0
Tokenizers: 0.21.0

Model Information

Property	Details
Model Type	Automatic Speech Recognition
Base Model	openai/whisper-large-v2
Tags	generated_from_trainer, multilingual, ASR, Open-Source
Languages Supported	wo, fr, en
Pipeline Tag	automatic-speech-recognition

Model Index

Name: whosper-large-v2
- Results:
  - Task:
    - Name: Automatic Speech Recognition
    - Type: automatic-speech-recognition
  - Dataset:
    - Name: Test Set
    - Type: custom
    - Split: test
    - Args:
      - Language: wo
  - Metrics:
    - Name: Test WER
      Type: wer
      Value: 23.45
    - Name: Test CER
      Type: cer
      Value: 11.01

Limitations

Reduced performance on English compared to [whosper-large](https://huggingface.co/sudoping01/whosper-large).
Less effective for general multilingual content compared to [whosper-large](https://huggingface.co/sudoping01/whosper-large).
Poor performance on very low-quality audio.
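
Since very poor recordings are a known weak spot, it can help to screen files before transcribing them. The following is a rough sketch using only the Python standard library. It handles 16-bit PCM WAV files only, and the thresholds in `quality_warnings` are arbitrary placeholders, not values recommended by the model authors. Both function names are hypothetical. The sample rate is reported because Whisper-family models operate on 16 kHz audio; whether the `whosper` package resamples input automatically is not stated in this card.

```python
import sys
import wave
from array import array
from math import sqrt
from typing import Dict, List


def inspect_wav(path: str) -> Dict[str, float]:
    """Summarise a 16-bit PCM WAV file: rate, duration, loudness, clipping."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        rate = wav.getframerate()
        channels = wav.getnchannels()
        n_frames = wav.getnframes()
        samples = array("h", wav.readframes(n_frames))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    total = len(samples)
    rms = sqrt(sum(s * s for s in samples) / total) / 32768.0 if total else 0.0
    clipped = sum(1 for s in samples if abs(s) >= 32767) / total if total else 0.0
    return {
        "sample_rate": rate,
        "channels": channels,
        "duration_s": n_frames / rate,
        "rms": rms,                   # 0.0 (silence) to about 1.0 (full scale)
        "clipped_fraction": clipped,  # share of samples at full scale
    }


def quality_warnings(
    info: Dict[str, float],
    min_rms: float = 0.01,      # placeholder threshold (assumption)
    max_clipped: float = 0.01,  # placeholder threshold (assumption)
) -> List[str]:
    """Return human-readable warnings for obviously problematic audio."""
    issues: List[str] = []
    if info["rms"] < min_rms:
        issues.append("very quiet")
    if info["clipped_fraction"] > max_clipped:
        issues.append("clipping")
    return issues
```

A file that triggers a warning is not necessarily unusable; the check is only meant to flag recordings worth listening to before trusting their transcripts.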

Training Data

Trained on diverse Wolof speech data:

ALFFA Public Dataset
FLEURS Dataset
Bus Urbain Dataset
Anta Women TTS Dataset
Kallama Dataset

This diversity ensures the model excels across:

Speaking styles and dialects
Code-switching patterns
Gender and age groups
Recording conditions

Contributing to African NLP

Whosper-large-v2 embodies our commitment to open science and the advancement of African language technologies. We believe that by making cutting-edge speech recognition models freely available, we can accelerate NLP development across Africa.

Join our mission to democratize AI technology:

Open Science: Use and build upon our research; all code, models, and documentation are open source.
Data Contribution: Share your Wolof speech datasets to help improve model performance.
Research Collaboration: Integrate Whosper into your research projects and share your findings.
Community Building: Help us create resources for African language processing.
Educational Impact: Use Whosper in educational settings to train the next generation of African AI researchers.

Together, we can ensure African languages are well-represented in the future of AI technology. Whether you're a researcher, developer, educator, or language enthusiast, your contributions can help bridge the technological divide.

📄 License

[Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0)

This model is released under the Apache 2.0 license to encourage research, commercial use, and innovation in African language technologies while ensuring proper attribution and patent protection. You are free to:

Use the model commercially.
Modify and distribute the model.
Create derivative works.
Benefit from the license's express patent grant.

Choosing Apache 2.0 aligns with our goals of open science and advancing African NLP while providing necessary protections for the community.

📚 Citation

@misc{whosper2025,
  title={Whosper-large: A Multilingual ASR Model for Wolof with Enhanced Code-Switching Capabilities},
  author={Seydou DIALLO},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/CAYTU/whosper-large},
  version={1.0}
}

🙏 Acknowledgments

Developed by [Seydou DIALLO](https://www.linkedin.com/in/seydou-diallo-08ab311ba) at Caytu Robotics's AI Department, building on OpenAI's [Whisper-large-v2](https://huggingface.co/openai/whisper-large-v2). Special thanks to the Wolof-speaking community and contributors advancing African language technology.

📞 Contact Us

For questions or support, contact us:

Email: sdiallo@caytu.com
