Viwhisper Medium

Developed by NhutP
A Whisper-medium model optimized for Vietnamese speech recognition, fine-tuned on 1308 hours of Vietnamese data.
Downloads 139
Release Time: 12/16/2024

Model Overview

A Vietnamese speech recognition model based on the OpenAI Whisper-medium architecture and fine-tuned on multiple Vietnamese datasets for high-accuracy speech-to-text conversion.

Model Features

Large-scale Vietnamese training
Fine-tuned on 1308 hours of Vietnamese data, including speech data from various sources
Multi-dataset support
Evaluated on multiple Vietnamese datasets including VSV-1100, Common Voice, and VIVOS
Low WER performance
Achieves a word error rate (WER) between 4.69 and 28.76 depending on the test set
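
For readers who want to check figures like these on their own recordings, here is a minimal sketch of a word-level WER calculation. It is not the evaluation script behind the numbers above: the page does not say which text normalization or tooling was used, and the function name `word_error_rate` is a hypothetical one chosen for this example. The reported values look like percentages, which is also an assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Words are split on whitespace only; no casing or punctuation
    normalization is applied (an assumption of this sketch).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    wer = word_error_rate("tôi đi học hôm nay", "tôi đi học hôm qua")
    print(f"WER: {100 * wer:.2f}")  # one substitution in five words -> 20.00
```

Scores from this sketch are only comparable with published numbers if both sides normalize the text the same way.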

Model Capabilities

Vietnamese speech recognition
Long audio processing
High-accuracy transcription
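
The capabilities above could be exercised roughly as follows, assuming the checkpoint is published on the Hugging Face Hub and loads with the `transformers` speech recognition pipeline. The repository id `NhutP/ViWhisper-medium` is an assumption inferred from the model and developer names on this page, and the helper names are hypothetical. Long audio is handled here by the pipeline's `chunk_length_s` option, which may differ from how the model's author processes long recordings.

```python
from typing import Any, Dict

# Assumption: Hub repository id inferred from the page, not confirmed by it.
MODEL_ID = "NhutP/ViWhisper-medium"


def vietnamese_generate_kwargs() -> Dict[str, str]:
    """Decoding options that pin Whisper to Vietnamese transcription."""
    return {"language": "vietnamese", "task": "transcribe"}


def transcribe(audio_path: str,
               model_id: str = MODEL_ID,
               chunk_length_s: float = 30.0) -> Dict[str, Any]:
    """Transcribe an audio file, splitting long audio into chunks.

    Requires the `transformers` and `torch` packages, plus ffmpeg for
    decoding audio files given by path.
    """
    # Imported lazily so the module can be imported without transformers.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=chunk_length_s,  # Whisper works on windows of up to 30 s
    )
    return asr(
        audio_path,
        return_timestamps=True,
        generate_kwargs=vietnamese_generate_kwargs(),
    )
```

Calling `transcribe("meeting.wav")` (a hypothetical file name) should return a dictionary with the full `text` and a list of timestamped `chunks`.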

Use Cases

Speech transcription
Vietnamese meeting minutes
Automatically convert Vietnamese meeting recordings into text transcripts
WER in the 4.69-8.1 range
Voice assistants
Provide speech recognition capabilities for Vietnamese voice assistants
Education
Language learning applications
Help learners practice Vietnamese pronunciation and listening
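
For the meeting-minutes use case, a transcript is usually easier to read with timestamps. The sketch below formats timestamped segments into plain-text minutes. It assumes the segment shape produced by the `transformers` speech recognition pipeline with `return_timestamps=True` (a list of dictionaries with `timestamp` and `text` keys); the function names and sample segments are hypothetical.

```python
from typing import Dict, List, Optional


def format_timestamp(seconds: Optional[float]) -> str:
    """Render seconds as HH:MM:SS; unknown times become a placeholder."""
    if seconds is None:
        return "??:??:??"
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def format_minutes(chunks: List[Dict]) -> str:
    """Turn timestamped transcript segments into one line per segment."""
    lines = []
    for chunk in chunks:
        start, end = chunk["timestamp"]
        text = chunk["text"].strip()
        lines.append(f"[{format_timestamp(start)} - {format_timestamp(end)}] {text}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical segments; the end time of a final segment can be missing.
    sample = [
        {"timestamp": (0.0, 4.2), "text": " Xin chào mọi người."},
        {"timestamp": (3725.0, None), "text": "Kết thúc."},
    ]
    print(format_minutes(sample))
```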