# Unsupervised pretraining
- **Depth Anything V2 Small** · depth-anything · Apache-2.0 · 3D Vision · English · 55.22k downloads · 64 likes
  Depth Anything V2 is a state-of-the-art monocular depth estimation model, trained on large-scale synthetic and real images. Compared to V1, it captures finer details and is more robust.
- **Viwav2vec2 Base 1.5k** · dragonSwing · Speech Recognition · Transformers · Other · 38 downloads · 0 likes
  Pretrained on 1.5k hours of Vietnamese speech data and suited to Vietnamese speech recognition tasks; it requires fine-tuning before use.
- **Wav2vec2 Base 10k Voxpopuli** · facebook · Speech Recognition · Transformers · Other · 2,504 downloads · 0 likes
  A foundational speech model pretrained on 10,000 hours of unlabeled data from the VoxPopuli corpus, supporting multilingual speech processing.
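The VoxPopuli Wav2Vec2 checkpoints in this list are pretrained purely self-supervised: spans of latent speech frames are masked, and the model is trained contrastively to pick out the true quantized latents for the masked positions. Below is a minimal stdlib sketch of just the span-masking step; the mask probability and span length follow the wav2vec 2.0 paper's defaults, while the function name, frame count, and seed are illustrative choices, not anything from these model cards.

```python
import random

def sample_mask(num_frames, mask_prob=0.065, span_len=10, rng=None):
    """wav2vec 2.0-style span masking: each frame is chosen as a span
    start with probability mask_prob, and span_len consecutive frames
    from each start are masked (spans may overlap)."""
    rng = rng or random.Random(0)  # fixed seed for a reproducible sketch
    mask = [False] * num_frames
    for start in range(num_frames):
        if rng.random() < mask_prob:
            for i in range(start, min(start + span_len, num_frames)):
                mask[i] = True
    return mask

mask = sample_mask(200)
print(sum(mask), "of", len(mask), "frames masked")
```

Because spans overlap, the effective fraction of masked frames (roughly half, with these defaults) is much higher than the 6.5% span-start probability, which is why the contrastive task stays hard.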
- **Wav2vec2 Base Sl Voxpopuli V2** · facebook · Speech Recognition · Transformers · Other · 31 downloads · 0 likes
  A speech model based on Facebook's Wav2Vec2 architecture, pretrained specifically for Slovenian (sl) on 11.3k hours of unlabeled data from the VoxPopuli corpus.
- **Wav2vec2 Base Pl Voxpopuli V2** · facebook · Speech Recognition · Transformers · Other · 30 downloads · 0 likes
  Polish Wav2Vec2 base model pretrained on the VoxPopuli corpus; suitable for speech recognition tasks.
- **T5 V1 1 Small** · google · Apache-2.0 · Large Language Model · English · 127.68k downloads · 26 likes
  T5 Version 1.1 is Google's improved text-to-text model: it uses the GEGLU activation in its feed-forward blocks and was pretrained with an unsupervised objective on the C4 dataset only, so it requires fine-tuning before use.
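The GEGLU feed-forward variant mentioned above projects the input twice and gates one projection with the GELU of the other: GEGLU(x) = GELU(xW) ⊙ (xV). Here is a pure-Python sketch on plain lists; the toy 2×2 weight matrices are arbitrary stand-ins for illustration, not the model's parameters.

```python
import math

def gelu(x):
    # Exact GELU via the Gaussian error function.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def geglu(x, w, v):
    """GEGLU(x) = GELU(x @ W) * (x @ V), elementwise over the hidden dim."""
    def matvec(m, vec):
        return [sum(mi * xi for mi, xi in zip(row, vec)) for row in m]
    gate = matvec(w, x)   # gating path, passed through GELU
    value = matvec(v, x)  # linear value path
    return [gelu(g) * b for g, b in zip(gate, value)]

# Toy 2 -> 2 projections (illustrative weights only)
out = geglu([1.0, -2.0], [[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [1.0, 1.0]])
print(out)
```

Compared with a plain ReLU feed-forward layer, the gated form lets the value path carry a signed, unsquashed signal while the GELU gate decides how much of it passes through.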
- **Wav2vec2 Base Pt Voxpopuli V2** · facebook · Speech Recognition · Transformers · Other · 30 downloads · 0 likes
  Wav2Vec2 base model pretrained on the Portuguese portion of the VoxPopuli corpus; suitable for speech recognition tasks.
- **Wav2vec2 Large Romance Voxpopuli V2** · facebook · Speech Recognition · Transformers · 26 downloads · 0 likes
  Facebook's Wav2Vec2 large model, pretrained only on 101.5k hours of unlabeled data from the Romance-language portion of the VoxPopuli corpus; suitable for speech recognition tasks.
- **Wav2vec2 Large Mt Voxpopuli V2** · facebook · Speech Recognition · Transformers · Other · 25 downloads · 0 likes
  Facebook's Wav2Vec2 large model, pretrained exclusively on unlabeled Maltese (mt) data from the VoxPopuli corpus; suitable for speech recognition tasks.
- **Mgpt** · THUMT · Large Language Model · Transformers · 147 downloads · 8 likes
  mGPT is a multilingual generation model pretrained on the mC4 dataset, supporting 101 languages and using a GPT-2-like Transformer architecture.
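A GPT-2-like decoder such as mGPT enforces left-to-right generation with a causal attention mask (position i may attend only to positions ≤ i) and is trained to predict each token from its prefix. A minimal stdlib sketch of both ideas; the function names and token IDs are illustrative only.

```python
def causal_mask(n):
    """n x n lower-triangular mask: True where attention is allowed,
    i.e. query position i may attend to key position j only if j <= i."""
    return [[j <= i for j in range(n)] for i in range(n)]

def lm_pairs(token_ids):
    """Causal LM training pairs: each token is the prediction target
    for the position immediately before it."""
    return list(zip(token_ids[:-1], token_ids[1:]))

print(causal_mask(3))
# [[True, False, False], [True, True, False], [True, True, True]]
print(lm_pairs([5, 7, 9, 2]))
# [(5, 7), (7, 9), (9, 2)]
```

This shift-by-one pairing is what makes causal pretraining fully unsupervised: raw text supplies both inputs and targets.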
- **Wav2vec2 Base Sv Voxpopuli** · facebook · Speech Recognition · Transformers · Other · 33 downloads · 0 likes
  A Wav2Vec2 base model pretrained on the Swedish subset of the VoxPopuli corpus; suitable for Swedish speech recognition tasks.
- **Wav2vec2 Base Sk Voxpopuli V2** · facebook · Speech Recognition · Transformers · Other · 31 downloads · 0 likes
  Wav2Vec2 base model pretrained on Slovak data from the VoxPopuli corpus; suitable for speech recognition tasks.
- **Mt5 Base** · google · Apache-2.0 · Large Language Model · Supports Multiple Languages · 118.49k downloads · 229 likes
  mT5 is a multilingual variant of the T5 model, pretrained on the mC4 corpus covering 101 languages; suitable for multilingual text processing tasks.
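Like T5, mT5 is pretrained with an unsupervised span-corruption objective: random token spans are replaced by sentinel markers in the input, and the target reconstructs the dropped spans in order. A minimal sketch on a whitespace-tokenized sentence; the sentinel naming follows T5's `<extra_id_N>` convention, but the function name and the fixed span choices are illustrative (real pretraining samples spans randomly).

```python
def span_corrupt(tokens, spans):
    """Replace each (start, end) token span with a sentinel in the input;
    the target lists each sentinel followed by the tokens it replaced,
    closed by one final sentinel."""
    inp, tgt, cursor = [], [], 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"
        inp.extend(tokens[cursor:start])  # keep text before the span
        inp.append(sentinel)              # drop the span, mark its place
        tgt.append(sentinel)
        tgt.extend(tokens[start:end])     # target recovers the dropped span
        cursor = end
    inp.extend(tokens[cursor:])
    tgt.append(f"<extra_id_{len(spans)}>")  # terminating sentinel
    return " ".join(inp), " ".join(tgt)

tokens = "Thank you for inviting me to your party last week".split()
inp, tgt = span_corrupt(tokens, [(1, 2), (6, 8)])
print(inp)  # Thank <extra_id_0> for inviting me to <extra_id_1> last week
print(tgt)  # <extra_id_0> you <extra_id_1> your party <extra_id_2>
```

Because the target contains only the dropped spans rather than the whole sentence, this objective is cheaper than full reconstruction while still forcing the model to use bidirectional context.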
- **Wav2vec2 Base Et Voxpopuli V2** · facebook · Speech Recognition · Transformers · Other · 30 downloads · 0 likes
  A speech model based on Facebook's Wav2Vec2 framework, pretrained specifically for Estonian.
- **Wav2vec2 Base Cs Voxpopuli V2** · facebook · Speech Recognition · Transformers · Other · 33 downloads · 1 like
  Wav2Vec2 base model pretrained on the VoxPopuli corpus, specialized for Czech speech processing.
- **Wav2vec2 Base It Voxpopuli** · facebook · Speech Recognition · Transformers · Other · 32 downloads · 0 likes
  Wav2Vec2 base model pretrained on unlabeled Italian data from VoxPopuli; suitable for speech recognition tasks.
- **Wav2vec2 Base De Voxpopuli V2** · facebook · Speech Recognition · Transformers · German · 44 downloads · 1 like
  A German speech model based on Facebook's Wav2Vec2 architecture, pretrained on 23.2k hours of unlabeled German data from the VoxPopuli corpus.
- **Wav2vec2 Base Nl Voxpopuli V2** · facebook · Speech Recognition · Transformers · Other · 22 downloads · 0 likes
  A speech model based on Facebook's Wav2Vec2 architecture, pretrained specifically for Dutch on 19.0k hours of unlabeled data from the VoxPopuli corpus.
- **Wav2vec2 Large Nl Voxpopuli** · facebook · Speech Recognition · Other · 18 downloads · 0 likes
  An automatic speech recognition model pretrained on the Dutch subset of the VoxPopuli corpus.
- **Wav2vec2 Large It Voxpopuli** · facebook · Speech Recognition · Other · 55 downloads · 0 likes
  A speech recognition model pretrained on unlabeled Italian data from VoxPopuli, using Facebook's Wav2Vec2 architecture.
- **Wav2vec2 Base Lt Voxpopuli V2** · facebook · Speech Recognition · Transformers · Other · 31 downloads · 0 likes
  A speech model based on Facebook's Wav2Vec2 architecture, pretrained specifically for Lithuanian on 14.4k hours of unlabeled data from the VoxPopuli corpus.
- **Wav2vec2 Base Hu Voxpopuli V2** · facebook · Speech Recognition · Transformers · Other · 30 downloads · 0 likes
  A speech pretraining model based on Facebook's Wav2Vec2 architecture, pretrained on Hungarian data from the VoxPopuli corpus.
- **Wav2vec2 Base Bg Voxpopuli V2** · facebook · Speech Recognition · Transformers · Other · 30 downloads · 0 likes
  A speech model based on Facebook's Wav2Vec2 architecture, pretrained specifically for Bulgarian; suitable for speech recognition tasks.
- **Wav2vec2 Base Lv Voxpopuli V2** · facebook · Speech Recognition · Transformers · Other · 29 downloads · 1 like
  A speech base model built on Facebook's Wav2Vec2 architecture, pretrained specifically for Latvian (lv) on 13.1k hours of unlabeled data from the VoxPopuli corpus.
- **Wav2vec2 Base Fr Voxpopuli** · facebook · Speech Recognition · Transformers · French · 30 downloads · 0 likes
  Wav2Vec2 base model pretrained on unannotated French data from VoxPopuli; suitable for French speech recognition tasks.
- **Mt5 Xxl** · google · Apache-2.0 · Large Language Model · Transformers · Supports Multiple Languages · 7,532 downloads · 68 likes
  mT5 is Google's multilingual text-to-text model, supporting 101 languages and pretrained on the mC4 dataset; suitable for a wide range of NLP tasks.
- **Wav2vec2 Base 100k Voxpopuli** · facebook · Speech Recognition · Transformers · Other · 148 downloads · 4 likes
  A speech base model pretrained on 100,000 hours of unannotated data from the VoxPopuli corpus.
- **Wav2vec2 Base Es Voxpopuli V2** · facebook · Speech Recognition · Transformers · Spanish · 46 downloads · 1 like
  Wav2Vec2 base model pretrained on 21.4k hours of unlabeled Spanish data; suitable for speech recognition tasks.
- **Wav2vec2 Large West Germanic Voxpopuli V2** · facebook · Speech Recognition · Transformers · 25 downloads · 1 like
  Facebook's Wav2Vec2 large model, pretrained exclusively on 66.3k hours of unlabeled data from the West Germanic languages in the VoxPopuli corpus.
- **Wav2vec2 Large El Voxpopuli V2** · facebook · Speech Recognition · Transformers · Other · 24 downloads · 0 likes
  Greek speech recognition model pretrained on 17.7k hours of unlabeled data from the VoxPopuli corpus.
- **Mt5 Xl** · google · Apache-2.0 · Large Language Model · Supports Multiple Languages · 3,104 downloads · 24 likes
  mT5 is the multilingual version of the T5 model, supporting 101 languages and pretrained on the mC4 corpus; suitable for a variety of natural language processing tasks.
- **Legal T5 Small Multitask Sv En** · SEBIS · Machine Translation · 17 downloads · 0 likes
  A multitask model for translating Swedish legal text to English, trained by combining a supervised translation task with an unsupervised masked language modeling task.
- **Legal T5 Small Trans Cs En Small Finetuned** · SEBIS · Machine Translation · 18 downloads · 0 likes
  A small T5 model (~60M-parameter architecture) for translating Czech legal text to English.