Wav2Vec2-base-VoxPopuli-V2
This model is a pretrained base version of Facebook's Wav2Vec2. It was pretrained exclusively on 14.4k hours of unlabeled Lithuanian (lt) speech from the VoxPopuli corpus. It has not been fine-tuned, so it serves as a starting point for Lithuanian speech tasks such as automatic speech recognition rather than a ready-to-use recognizer.
⨠Features
- Language-specific pretraining: Pretrained only on Lithuanian data from the VoxPopuli corpus, which makes it a suitable base for Lithuanian speech processing.
- Audio sampling requirement: Expects speech audio sampled at 16 kHz, the rate used during pretraining.
Documentation
General Information
The model was pretrained on speech audio sampled at 16 kHz. When using the model, make sure your speech input is also sampled at 16 kHz.
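The 16 kHz requirement can be checked, and if necessary met, before audio reaches the model. The following is a minimal sketch using only the Python standard library; the helper names `wav_sample_rate` and `resample_linear` are hypothetical and not part of any model API. The resampler uses plain linear interpolation with no anti-aliasing filter, so treat it as an illustration of the idea rather than a production-quality method.

```python
import wave

TARGET_SAMPLE_RATE = 16_000  # sampling rate the model was pretrained on


def wav_sample_rate(path):
    """Return the sample rate (in Hz) stored in a WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def resample_linear(samples, source_rate, target_rate=TARGET_SAMPLE_RATE):
    """Resample a mono signal (sequence of numbers) by linear interpolation.

    Illustrative only: no low-pass filtering is applied, so downsampling
    can introduce aliasing.
    """
    if source_rate <= 0 or target_rate <= 0:
        raise ValueError("sample rates must be positive")
    if source_rate == target_rate or len(samples) < 2:
        return list(samples)

    target_length = max(1, round(len(samples) * target_rate / source_rate))
    step = source_rate / target_rate  # source samples per output sample
    last_index = len(samples) - 1

    resampled = []
    for i in range(target_length):
        # Clamp so positions past the end reuse the final sample.
        position = min(i * step, last_index)
        left = int(position)
        right = min(left + 1, last_index)
        fraction = position - left
        resampled.append(samples[left] * (1.0 - fraction) + samples[right] * fraction)
    return resampled
```

A typical flow is to call `wav_sample_rate` on each input file and resample only when the result differs from 16000. For real training or inference data, a band-limited resampler such as `librosa.resample` or `torchaudio.functional.resample` is likely the better choice.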
Model Limitation
This model does not have a tokenizer because it was pretrained on audio alone. To use it for speech recognition, create a tokenizer and fine-tune the model on labeled Lithuanian speech data (audio paired with transcriptions). For a more in-depth explanation of how to fine-tune the model, see [this blog](https://huggingface.co/blog/fine-tune-xlsr-wav2vec2).
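Creating a tokenizer starts with a character vocabulary derived from the training transcriptions. The sketch below follows the general approach of the linked blog post under a few assumptions: transcripts are already normalized (punctuation removed), the word delimiter is `|`, and the special tokens are named `[UNK]` and `[PAD]`. The helper names `build_ctc_vocab` and `save_vocab` and the sample sentences are hypothetical.

```python
import json


def build_ctc_vocab(transcripts):
    """Build a character-to-index vocabulary from transcript strings.

    Assumes transcripts are already normalized; every remaining
    character (lowercased) becomes a vocabulary entry.
    """
    chars = set()
    for text in transcripts:
        chars.update(text.lower())

    # Spaces are represented by the "|" delimiter, added below.
    chars.discard(" ")
    chars.discard("|")

    vocab = {char: index for index, char in enumerate(sorted(chars))}
    vocab["|"] = len(vocab)      # word delimiter token
    vocab["[UNK]"] = len(vocab)  # unknown-character token
    vocab["[PAD]"] = len(vocab)  # padding token, also used as the CTC blank
    return vocab


def save_vocab(vocab, path="vocab.json"):
    """Write the vocabulary to a JSON file, keeping non-ASCII letters readable."""
    with open(path, "w", encoding="utf-8") as vocab_file:
        json.dump(vocab, vocab_file, ensure_ascii=False, indent=2)


# Hypothetical sample transcripts, for illustration only.
example_vocab = build_ctc_vocab(["labas rytas", "ačiū"])
```

The saved `vocab.json` can then be passed to `Wav2Vec2CTCTokenizer` from the `transformers` library, with `unk_token="[UNK]"`, `pad_token="[PAD]"`, and `word_delimiter_token="|"`, as the blog post describes. Fine-tuning itself still requires labeled audio.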
Paper and Authors
Additional Information
For more information, visit the official website here.
License
The model is released under the cc-by-nc-4.0 license.
| Property | Details |
|---|---|
| Model Type | Wav2Vec2-base-VoxPopuli-V2 |
| Training Data | 14.4k hours of unlabeled Lithuanian speech from the VoxPopuli corpus |
| License | cc-by-nc-4.0 |
Important Note
The model was pretrained on speech audio sampled at 16 kHz. Make sure your speech input is also sampled at 16 kHz.
Usage Tip
To use this model for speech recognition, create a tokenizer and fine-tune the model on labeled Lithuanian speech data (audio paired with transcriptions). Refer to [this blog](https://huggingface.co/blog/fine-tune-xlsr-wav2vec2) for detailed fine-tuning instructions.