Exp W2v2t Zh Cn Wavlm S596

Developed by jonatasgrosman
A Simplified Chinese speech recognition model, fine-tuned from microsoft/wavlm-large on the Common Voice 7.0 (zh-CN) dataset.
Downloads 22
Release Time: 7/10/2022

Model Overview

This model targets Simplified Chinese speech recognition. It is fine-tuned from the WavLM-large architecture and expects speech input sampled at 16 kHz.

Model Features

Based on WavLM-large Architecture
Uses Microsoft's pre-trained WavLM-large model as the foundation for speech feature extraction.
Optimized for Chinese Speech Recognition
Fine-tuned specifically on Simplified Chinese speech from Common Voice 7.0 (zh-CN) for Chinese speech recognition tasks.
16kHz Sampling Rate Support
Accepts speech input sampled at 16 kHz, a rate commonly used for speech recordings, which simplifies practical deployment.
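
Because the model expects 16 kHz input, recordings captured at other rates need to be resampled first. The following is a minimal sketch using only the Python standard library; the helper names `resample_linear` and `load_wav_16k` are hypothetical and not part of the model or any library. Linear interpolation applies no anti-aliasing filter, so for real deployments a dedicated resampler (for example from torchaudio or librosa) is likely a better choice.

```python
import array
import sys
import wave

TARGET_RATE = 16_000  # sampling rate the model card says the model expects


def resample_linear(samples, src_rate, dst_rate=TARGET_RATE):
    """Resample a sequence of mono samples by linear interpolation."""
    if src_rate <= 0 or dst_rate <= 0:
        raise ValueError("sample rates must be positive")
    if src_rate == dst_rate or len(samples) == 0:
        return list(samples)
    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate  # distance between output samples, in input samples
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * step
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        # Weighted average of the two neighbouring input samples.
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


def load_wav_16k(path):
    """Read a 16-bit mono PCM WAV file and return float samples at 16 kHz."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2 or wf.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono PCM")
        rate = wf.getframerate()
        pcm = array.array("h")
        pcm.frombytes(wf.readframes(wf.getnframes()))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV PCM data is little-endian
    floats = [s / 32768.0 for s in pcm]  # scale to [-1.0, 1.0)
    return resample_linear(floats, rate)
```

For example, `resample_linear([0.0, 1.0, 2.0, 3.0], 32000)` halves the number of samples, keeping every second value.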

Model Capabilities

Chinese Speech Recognition
Speech-to-Text
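
As an illustration of the speech-to-text capability, the sketch below transcribes one audio file with the Hugging Face `transformers` automatic-speech-recognition pipeline. The Hub identifier `jonatasgrosman/exp_w2v2t_zh-cn_wavlm_s596` is an assumption inferred from the page title and author name, and the helper `transcribe` is hypothetical. Running it requires `transformers`, a PyTorch install, network access to download the weights, and ffmpeg for decoding audio files passed by path.

```python
# Assumed Hub ID, inferred from the page title and author; verify before use.
MODEL_ID = "jonatasgrosman/exp_w2v2t_zh-cn_wavlm_s596"


def transcribe(audio_path, model_id=MODEL_ID):
    """Return the transcript of one audio file as a string."""
    # Imported lazily so this module loads even where transformers is absent.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)  # the pipeline returns a dict with a "text" key
    return result["text"]


# Example call (the file name is a placeholder):
# print(transcribe("meeting_recording.wav"))
```

Building the pipeline once and reusing it across files avoids reloading the model for every call; the function above rebuilds it each time only to keep the sketch short.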

Use Cases

Speech Transcription
Meeting Minutes Transcription
Automatically transcribe Chinese meeting recordings into text records
Voice Input Method
Convert spoken input into text
Accessibility Applications
Real-time Caption Generation
Generate Chinese subtitles for video content or real-time conversations
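
For the subtitle use case, transcripts have to be paired with timing information and written in a subtitle format. The sketch below renders timed transcript segments as SubRip (SRT) text. It assumes the segments are already available as `(start_seconds, end_seconds, text)` tuples; how those timings are obtained (for example by splitting the audio into chunks before transcription) is outside this sketch, and the function names are hypothetical.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as HH:MM:SS,mmm, the SRT timestamp form."""
    if seconds < 0:
        raise ValueError("time must be non-negative")
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Render (start_s, end_s, text) tuples as an SRT document string."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        # Each cue: sequence number, time range, then the caption text.
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)


# Example:
# print(to_srt([(0.0, 1.5, "你好"), (1.5, 3.0, "欢迎参加会议")]))
```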