# ViT architecture optimization
Sapiens Depth 1b Torchscript
Sapiens is a family of vision transformer models pre-trained on 300 million human images at 1024 x 1024 resolution, targeting human-centric vision tasks.
3D Vision English
facebook
160
0
Vit Facial Expression Recognition
A ViT-based facial expression recognition model, fine-tuned on the FER2013, MMI, and AffectNet datasets, that classifies faces into seven emotion categories.
Face-related
Transformers

mo-thecreator
8,730
16
Featured Recommended AI Models