Chinese Hubert Base

Developed by TencentGameMate
A Chinese speech model pretrained on the 10,000-hour WenetSpeech L subset, suitable for speech-related tasks
Downloads: 1,312
Release Date: 6/2/2022

Model Overview

This model is pretrained on Chinese speech data using the Wav2Vec2/HuBERT architecture and can be used for tasks such as speech feature extraction. It does not perform speech recognition out of the box: for that, a tokenizer must be created and the model fine-tuned on labeled data.
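
As a rough illustration of the feature extraction use mentioned above, the following sketch loads the model with the Hugging Face `transformers` library and returns the final hidden states for one audio file. The hub ID `TencentGameMate/chinese-hubert-base`, the helper name `extract_features`, and the requirement of mono 16 kHz input are assumptions made for this example, not details stated on this page.

```python
def extract_features(wav_path, model_id="TencentGameMate/chinese-hubert-base", use_half=False):
    """Return the last hidden states for one audio file (hypothetical helper).

    Assumptions: `model_id` is the checkpoint's Hugging Face hub ID, and the
    file at `wav_path` is mono audio sampled at 16 kHz.
    """
    # Heavy dependencies are imported lazily, inside the function.
    import soundfile as sf
    import torch
    from transformers import HubertModel, Wav2Vec2FeatureExtractor

    device = "cuda" if torch.cuda.is_available() else "cpu"
    feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(model_id)
    model = HubertModel.from_pretrained(model_id).to(device).eval()

    # Half precision is only applied on GPU in this sketch.
    half = use_half and device == "cuda"
    if half:
        model = model.half()

    wav, sample_rate = sf.read(wav_path)
    inputs = feature_extractor(wav, sampling_rate=sample_rate, return_tensors="pt")
    input_values = inputs.input_values.to(device)
    if half:
        input_values = input_values.half()

    with torch.no_grad():
        outputs = model(input_values)
    # Shape: (batch, frames, hidden_size)
    return outputs.last_hidden_state
```

A call such as `extract_features("sample.wav")`, where `sample.wav` is a placeholder path, would return one feature vector per output frame.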

Model Features

Large-Scale Chinese Pretraining
Pretrained on 10,000 hours of Chinese speech data (WenetSpeech L subset)
Lightweight Deployment
Supports half-precision inference to reduce computational resource requirements
Flexible Adaptation
Can serve as a foundation model adaptable to various downstream speech tasks
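
To make the half-precision point concrete, here is a back-of-the-envelope estimate of the memory taken by the model weights alone. The parameter count of roughly 95 million is an assumption based on the size commonly reported for HuBERT Base. It is not a figure given on this page, and activations and framework overhead are ignored.

```python
# Assumption: approximate parameter count of a HuBERT Base model.
PARAM_COUNT = 95_000_000


def weight_memory_mib(n_params, bytes_per_param):
    """Memory needed to store the weights alone, in MiB."""
    return n_params * bytes_per_param / (1024 ** 2)


fp32_mib = weight_memory_mib(PARAM_COUNT, 4)  # float32: 4 bytes per parameter
fp16_mib = weight_memory_mib(PARAM_COUNT, 2)  # float16: 2 bytes per parameter

print(f"float32 weights: {fp32_mib:.1f} MiB")
print(f"float16 weights: {fp16_mib:.1f} MiB")
```

Whatever the exact parameter count, storing weights in float16 halves the weight memory relative to float32.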

Model Capabilities

Speech Feature Extraction
Speech Representation Learning
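
When working with the extracted representations, it helps to know how many feature frames a clip produces. The sketch below assumes this checkpoint uses the standard Wav2Vec2/HuBERT convolutional front end (seven layers with the kernel sizes and strides listed) and 16 kHz input. Neither is stated on this page, so check both against the model's configuration.

```python
# Assumption: standard Wav2Vec2/HuBERT convolutional feature encoder.
CONV_KERNELS = (10, 3, 3, 3, 3, 2, 2)
CONV_STRIDES = (5, 2, 2, 2, 2, 2, 2)


def num_output_frames(num_samples):
    """Number of feature frames produced for a waveform of `num_samples` samples."""
    length = num_samples
    for kernel, stride in zip(CONV_KERNELS, CONV_STRIDES):
        # Output length of an unpadded 1-D convolution.
        length = (length - kernel) // stride + 1
    return length


frames_per_second = num_output_frames(16000)  # one second of 16 kHz audio
print(frames_per_second)  # 49
```

The strides multiply to 320 samples, so under these assumptions each frame covers a hop of about 20 ms.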

Use Cases

Speech Processing
Speech Recognition Foundation Model
Can be fine-tuned to build Chinese speech recognition systems
Requires fine-tuning with tokenizers and labeled data
Speech Feature Extraction
Extracts high-level feature representations of speech
Can be used for subsequent speech analysis tasks
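
The speech recognition use case requires a tokenizer, which the pretrained model does not provide. One common approach for Chinese is a character-level vocabulary built from the training transcripts. The sketch below is a minimal illustration under that assumption. The special token names and the sample transcripts are hypothetical, and nothing here is prescribed by the model's authors.

```python
import json


def build_char_vocab(transcripts):
    """Build a character-level vocabulary mapping from a list of transcripts."""
    chars = sorted({ch for text in transcripts for ch in text if not ch.isspace()})
    # Hypothetical special tokens: padding (often reused as the CTC blank),
    # unknown character, and word delimiter.
    vocab = {"[PAD]": 0, "[UNK]": 1, "|": 2}
    for ch in chars:
        vocab.setdefault(ch, len(vocab))
    return vocab


# Hypothetical labeled data for illustration only.
transcripts = ["今天天气很好", "天气预报"]
vocab = build_char_vocab(transcripts)

# Serialized form that could be saved as a vocab.json file.
vocab_json = json.dumps(vocab, ensure_ascii=False)
print(len(vocab))  # 10
```

A vocabulary file like this could then be passed to a CTC tokenizer such as `Wav2Vec2CTCTokenizer` in `transformers`, and the model fine-tuned with a CTC head on the labeled audio.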