viwav2vec2-base-1.5k Open-source Vietnamese Speech Model - Empowering Precise Vietnamese Speech Recognition

Viwav2vec2 Base 1.5k

Developed by dragonSwing

This model is pretrained on 1.5k hours of Vietnamese speech data. It is intended for Vietnamese speech recognition and must be fine-tuned before use.

Speech Recognition

Transformers

Other #Vietnamese speech recognition #1.5k hours pretraining #16kHz sampling

Downloads 38

Release Time: 5/3/2022

Model Overview

A Vietnamese speech model based on the Wav2Vec2 architecture, pretrained on 1.5k hours of read and broadcast speech. It expects speech input sampled at 16kHz.
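Because the model expects 16kHz input, it can help to check a recording's sampling rate before passing it on. The following is a minimal sketch using only the Python standard library; the helper names wav_sample_rate and needs_resampling are hypothetical, and the sketch assumes the audio is stored as uncompressed PCM WAV. It only detects a mismatch; the resampling itself would be done with an audio library of your choice.

```python
import wave

EXPECTED_RATE = 16_000  # sampling rate stated in the model description


def wav_sample_rate(path):
    """Return the sampling rate recorded in a PCM WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def needs_resampling(path, expected_rate=EXPECTED_RATE):
    """Return True if the file's rate differs from the rate the model expects."""
    return wav_sample_rate(path) != expected_rate
```

A file for which needs_resampling returns True should be converted to 16kHz before feature extraction or fine-tuning.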

Model Features

Large-scale Vietnamese pretraining

Pretrained on 1.5k hours of Vietnamese speech data covering both read and broadcast speech.

16kHz sampling support

The model is optimized for speech sampled at 16kHz, so make sure input audio uses this sampling rate.

Requires fine-tuning

The model must be fine-tuned on a downstream task (e.g. Vietnamese ASR) to perform well on it.
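One common preparation step when fine-tuning a Wav2Vec2-style model for ASR is building a character vocabulary from the training transcripts. The sketch below assumes a character-level CTC setup, which this page does not confirm; the function name build_ctc_vocab and the special tokens "|", "[UNK]" and "[PAD]" are illustrative choices. Text is normalized to Unicode NFC first, so that a Vietnamese letter typed as a base character plus combining diacritic maps to the same entry as its precomposed form.

```python
import unicodedata


def build_ctc_vocab(transcripts):
    """Map every character seen in the transcripts to an integer id.

    Assumption: character-level CTC fine-tuning, with "|" standing in
    for the space between words and two extra special tokens.
    """
    chars = set()
    for text in transcripts:
        # NFC merges base letter + combining marks into a single code point.
        chars.update(unicodedata.normalize("NFC", text).lower())
    chars.discard(" ")  # spaces are represented by the "|" token below
    chars.discard("|")  # avoid clashing with the word delimiter

    vocab = {ch: idx for idx, ch in enumerate(sorted(chars))}
    vocab["|"] = len(vocab)      # word delimiter
    vocab["[UNK]"] = len(vocab)  # unknown character
    vocab["[PAD]"] = len(vocab)  # padding, typically reused as the CTC blank
    return vocab


vocab = build_ctc_vocab(["xin chào", "cảm ơn"])
print(len(vocab))  # 10 distinct letters + 3 special tokens
```

The resulting dictionary can be written to a JSON file (with ensure_ascii=False to keep the Vietnamese characters readable) and handed to whichever tokenizer the fine-tuning toolkit uses.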

Model Capabilities

Vietnamese speech feature extraction

Speech representation learning
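To give a sense of what the extracted features look like, the sketch below estimates how many feature frames the model produces for a given number of input samples. It assumes the standard wav2vec 2.0 base convolutional feature encoder (seven layers with the kernel sizes and strides listed in CONV_LAYERS); this page does not state the exact configuration, so treat these values as an assumption and check them against the model's config file.

```python
# (kernel_size, stride) per convolutional layer.
# Assumption: the standard wav2vec 2.0 base feature encoder layout.
CONV_LAYERS = [(10, 5), (3, 2), (3, 2), (3, 2), (3, 2), (2, 2), (2, 2)]


def num_output_frames(num_samples, conv_layers=CONV_LAYERS):
    """Number of feature frames produced for num_samples raw audio samples."""
    length = num_samples
    for kernel, stride in conv_layers:
        # Output length of an unpadded 1-D convolution.
        length = (length - kernel) // stride + 1
    return length


print(num_output_frames(16_000))  # one second of 16kHz audio -> 49
```

Under this assumption the overall stride is 320 samples, so each frame covers a step of roughly 20 ms of 16kHz audio.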

Use Cases

Speech technology

Vietnamese speech recognition system

Build a Vietnamese ASR system by fine-tuning the model.

Speech analysis

Use the model for Vietnamese speech feature analysis and representation learning.
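For the speech recognition use case above, a fine-tuned model's per-frame predictions still have to be turned into text. The sketch below shows greedy CTC decoding, assuming the model was fine-tuned with a CTC head (this page does not specify the fine-tuning objective). The function name ctc_greedy_decode, the token table and the "|" word delimiter are hypothetical and would come from your own fine-tuning vocabulary.

```python
from itertools import groupby


def ctc_greedy_decode(frame_ids, id_to_token, blank_id, word_delimiter="|"):
    """Collapse repeated frame predictions, drop blanks, and join into text."""
    # Step 1: merge runs of identical consecutive ids.
    collapsed = [token_id for token_id, _ in groupby(frame_ids)]
    # Step 2: remove the CTC blank symbol.
    tokens = [id_to_token[i] for i in collapsed if i != blank_id]
    # Step 3: turn the word delimiter back into spaces.
    return "".join(tokens).replace(word_delimiter, " ").strip()


# Hypothetical vocabulary and per-frame argmax ids, for illustration only.
id_to_token = {0: "[PAD]", 1: "x", 2: "i", 3: "n", 4: "|",
               5: "c", 6: "h", 7: "à", 8: "o"}
frames = [1, 1, 0, 2, 3, 3, 4, 4, 0, 5, 6, 6, 0, 7, 8, 8]
print(ctc_greedy_decode(frames, id_to_token, blank_id=0))  # xin chào
```

A blank between two identical ids keeps them as separate characters, which is how CTC represents doubled letters.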

