Wav2vec2 Bartpho

Developed by nguyenvulebinh
An automatic speech recognition model for Vietnamese that outputs normalized text with timestamp labels and multi-speaker segmentation.
Downloads 472
Release Time: 10/5/2023

Model Overview

This model combines the wav2vec2 and BARTpho architectures and is designed specifically for Vietnamese automatic speech recognition. It produces timestamped text output and supports multi-speaker segmentation.

Model Features

Timestamp Labeling: Marks precise timestamps for recognized text.
Multi-speaker Segmentation: Identifies and segments speech from different speakers.
Text Normalization: Outputs normalized recognized text.

Model Capabilities

Vietnamese speech recognition
Timestamp labeling
Multi-speaker segmentation
Text normalization output

Use Cases

Speech Transcription
News Transcription: Transcribes Vietnamese news broadcasts into timestamped text. Sample output includes precise time markers and segmentation.
Meeting Minutes
Multi-speaker Meeting Minutes: Automatically identifies and segments speech from different speakers in meetings, distinguishing between speakers and marking when each one speaks.
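
Example: Turning Segmented Output into Meeting Minutes

The meeting-minutes use case can be illustrated with a minimal Python sketch. It does not call the model. It assumes that the recognizer's output has already been converted into a list of segments, each carrying a speaker label, start and end times in seconds, and normalized text. The `Segment` class, its field names, and the helpers `format_timestamp` and `to_minutes` are hypothetical names introduced here for illustration and are not part of the model's actual interface.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class Segment:
    """One recognized span of speech (hypothetical schema, not the model's real output)."""
    speaker: str  # speaker label, e.g. "SPEAKER_1"
    start: float  # start time in seconds
    end: float    # end time in seconds
    text: str     # normalized transcript text


def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    minutes, ms = divmod(total_ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{minutes:02d}:{secs:02d}.{ms:03d}"


def to_minutes(segments: list[Segment]) -> list[str]:
    """Merge consecutive segments from the same speaker into one line per turn."""
    ordered = sorted(segments, key=lambda s: s.start)
    lines = []
    # groupby only groups adjacent items, so a speaker who talks again
    # later in the meeting gets a new line rather than being merged.
    for speaker, group in groupby(ordered, key=lambda s: s.speaker):
        turn = list(group)
        text = " ".join(s.text for s in turn)
        span = f"{format_timestamp(turn[0].start)} - {format_timestamp(turn[-1].end)}"
        lines.append(f"[{span}] {speaker}: {text}")
    return lines
```

Calling `to_minutes` on a list of such segments yields one line per speaker turn, each prefixed with the turn's time span. Sorting by start time first is a design choice that keeps the sketch robust to segments arriving out of order.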