W

Wavlm Bart

Developed by nguyenvulebinh
A sequence-to-sequence model supporting English automatic speech recognition (ASR), capable of outputting normalized text, timestamp annotations, and multi-speaker segmentation.
Downloads 24
Release Time : 5/23/2023

Model Overview

This model is based on the wav2vec2 and bartpho architectures, primarily designed for English speech recognition tasks, supporting output with timestamped text and multi-speaker segmentation.

Model Features

Timestamp annotation
Capable of annotating recognized text with precise timestamps.
Multi-speaker segmentation
Supports recognizing and segmenting speech from different speakers.
Normalized text output
Outputs normalized text results.

Model Capabilities

English speech recognition
Timestamp annotation
Multi-speaker segmentation

Use Cases

Speech transcription
Meeting minutes
Convert meeting recordings into timestamped text records.
Accurately recognizes speech content and annotates speaking time points.
Interview transcription
Transcribe interview recordings and distinguish between different speakers.
Automatically segments speeches from different interviewees.
Featured Recommended AI Models
AIbase
Empowering the Future, Your AI Solution Knowledge Base
Š 2025AIbase