Exp W2v2t En No Pretraining S289
Developed by jonatasgrosman
This English speech recognition model starts from a randomly initialized wav2vec2 architecture (no pre-trained weights) and is fine-tuned on the Common Voice 7.0 dataset.
Downloads 18
Release Time: 7/8/2022
Model Overview
This model is used primarily for English speech recognition: it converts spoken English into text.
Model Features
Random Initialization Training
The model starts training from a randomly initialized wav2vec2 architecture, rather than using pre-trained weights.
16kHz Sampling Rate Support
Input speech must be sampled at 16 kHz for the model to recognize it accurately. Audio recorded at a different rate should be resampled first.
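As a rough illustration of the 16 kHz requirement, the sketch below reads the sampling rate of a WAV file with Python's standard `wave` module and, if needed, resamples a mono signal by linear interpolation. The helper names (`wav_sample_rate`, `resample_linear`, `make_test_wav`) are hypothetical and not part of the model or its tooling. The interpolation applies no anti-aliasing filter, so a dedicated audio library is likely a better choice in practice.

```python
import io
import math
import struct
import wave

TARGET_RATE = 16_000  # sampling rate the model expects, per the feature description


def wav_sample_rate(source):
    """Return the sampling rate in Hz of a WAV file (path or file-like object)."""
    with wave.open(source, "rb") as wav:
        return wav.getframerate()


def resample_linear(samples, src_rate, dst_rate=TARGET_RATE):
    """Resample a mono sequence of floats by linear interpolation.

    Illustrative only: no anti-aliasing filter is applied.
    """
    if src_rate == dst_rate or not samples:
        return list(samples)
    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate  # distance between output samples, in input samples
    out = []
    for i in range(n_out):
        pos = i * step
        lo = min(int(pos), len(samples) - 1)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


def make_test_wav(rate, seconds=0.01, freq=440.0):
    """Build a tiny in-memory 16-bit mono WAV so the example needs no files."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        n = int(rate * seconds)
        frames = b"".join(
            struct.pack("<h", int(32767 * 0.5 * math.sin(2 * math.pi * freq * t / rate)))
            for t in range(n)
        )
        wav.writeframes(frames)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    rate = wav_sample_rate(make_test_wav(48_000))
    if rate != TARGET_RATE:
        print(f"Input is {rate} Hz; resample to {TARGET_RATE} Hz before recognition.")
```

Checking the rate before recognition is cheap and catches a common source of poor transcripts, since a wav2vec2 model receives raw samples and cannot tell what rate they were recorded at.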
Model Capabilities
English speech recognition
Speech-to-text
Use Cases
Speech Transcription
Converts English speech into text, for scenarios such as meeting minutes and voice notes.
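A minimal sketch of the transcription use case, assuming the model is published on the Hugging Face Hub and loads with the `transformers` automatic-speech-recognition pipeline. The repository id below is an assumption inferred from the model title and developer name, not something this page states. The `transcribe` helper is hypothetical.

```python
# Assumed Hub repository id, inferred from the title and developer name above.
MODEL_ID = "jonatasgrosman/exp_w2v2t_en_no-pretraining_s289"


def transcribe(audio_path, model_id=MODEL_ID):
    """Transcribe one audio file with the transformers ASR pipeline.

    Requires the `transformers` package with a backend such as PyTorch;
    decoding audio from a file path typically also needs ffmpeg installed.
    """
    # Imported lazily so this module can be loaded without transformers present.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)  # the pipeline returns a dict with a "text" key
    return result["text"]
```

Calling `transcribe("meeting.wav")` (a hypothetical file name) would return the recognized text as a string. Because the model was trained without pre-trained weights, its transcripts may be less accurate than those of pre-trained wav2vec2 variants, so reviewing the output before relying on it is advisable.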