Skyly

Developed by Siyam
SKYLy is a speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the common_voice dataset. It reaches a word error rate (WER) of 0.4083 on the evaluation set.
Downloads 26
Release Time: 5/1/2022

Model Overview

This is an automatic speech recognition (ASR) model that converts speech into text. It is fine-tuned from wav2vec2-large-xlsr-53, a base model pretrained on speech in 53 languages, and supports multilingual speech recognition.
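As an illustration of how a checkpoint like this is typically used, here is a minimal sketch built on the Hugging Face `transformers` ASR pipeline. The function name `transcribe` and its parameters are hypothetical, and the Hub repository id of this model is not given on this page, so it has to be supplied by the caller. Decoding audio from a file path also assumes `ffmpeg` is installed.

```python
def transcribe(audio_path: str, model_id: str) -> str:
    """Transcribe one audio file with a wav2vec2-style ASR checkpoint.

    model_id is the Hub repository id of the checkpoint. It is an
    assumption of this sketch and is not stated on this page.
    """
    # Imported lazily so that defining the function does not require
    # transformers to be installed.
    from transformers import pipeline

    # The pipeline loads the model together with its processor and, when
    # given a file path, decodes the audio at the sampling rate the model
    # expects.
    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)
    return result["text"]
```

A call would look like `transcribe("meeting.wav", model_id=...)`, with the repository id filled in by the user.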

Model Features

Reported Word Error Rate
Achieved a word error rate (WER) of 0.4083 on the evaluation set, meaning roughly 41 word-level errors for every 100 reference words
Based on wav2vec2 Architecture
Uses facebook/wav2vec2-large-xlsr-53 as the base model, which provides robust speech feature extraction
Multilingual Support
Trained on the common_voice multilingual dataset, supporting speech recognition in multiple languages
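To make the WER figure quoted above concrete, the following sketch computes word error rate as the word-level edit distance divided by the number of reference words. It is a generic illustration, not the evaluation script used for this model, and the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

For the reference "the cat sat on the mat" and the hypothesis "the cat sit on mat", there is one substitution and one deletion against six reference words, so the function returns 2/6. Because insertions count as errors, WER can exceed 1.0 on very poor transcripts.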

Model Capabilities

Speech-to-Text
Multilingual Speech Recognition
Real-time Speech Processing

Use Cases

Speech Transcription
Automatic Meeting Transcription
Automatically converts meeting recordings into text transcripts
Approximately 59% word-level accuracy (inferred as 1 - WER, with WER = 0.4083)
Voice Assistant
Used as the speech recognition module for voice control systems
Accessibility Applications
Hearing Assistance Tool
Provides real-time speech-to-text services for the hearing impaired