LeViT-128S

Developed by Facebook
LeViT-128S is a vision Transformer model pretrained on the ImageNet-1k dataset. It borrows design elements from convolutional networks to achieve faster inference.
Downloads 3,198
Release Time: 6/1/2022

Model Overview

LeViT is an image classification model that integrates convolutional and Transformer components, optimizing inference speed while maintaining high accuracy.

Model Features

Hybrid Architecture Design
Combines the strengths of convolutional networks and Transformers to optimize computational efficiency while maintaining performance on vision tasks.
Efficient Inference
Designed for fast inference, with lower computational overhead than pure Transformer architectures.
ImageNet Pretraining
Pretrained on the ImageNet-1k dataset, ready for direct use in thousand-class image classification tasks.
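
As a rough illustration of direct use for thousand-class classification, the sketch below loads the model through the Hugging Face transformers library and returns the top predictions for one image. The checkpoint name "facebook/levit-128S" is an assumption based on the developer and model name above, and the helper names top_k and classify are hypothetical. It also assumes torch, transformers, and Pillow are installed.

```python
import math


def top_k(logits, labels, k=5):
    """Return the k most probable (label, probability) pairs from raw logits."""
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank class indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(labels[i], probs[i]) for i in ranked[:k]]


def classify(image_path, checkpoint="facebook/levit-128S", k=5):
    """Classify one image file. The checkpoint id is an assumption, not from the source."""
    # Heavy imports live inside the function so top_k stays usable without them.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, AutoModelForImageClassification

    processor = AutoImageProcessor.from_pretrained(checkpoint)
    model = AutoModelForImageClassification.from_pretrained(checkpoint)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    # id2label maps class index to a human-readable ImageNet-1k label.
    return top_k(logits, model.config.id2label, k)
```

Calling classify("photo.jpg") with a local image path (a placeholder here) would download the weights on first use and return up to five (label, probability) pairs.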

Model Capabilities

Image Classification
Visual Feature Extraction
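
For visual feature extraction, one possible approach is to load the backbone without its classification head and average the final token sequence into a single vector per image. The sketch below assumes the Hugging Face transformers library and the checkpoint name "facebook/levit-128S" (an assumption, as is the mean-pooling choice). The helper names embed and cosine_similarity are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def embed(image_path, checkpoint="facebook/levit-128S"):
    """Return one feature vector for an image. The checkpoint id is an assumption."""
    # Heavy imports live inside the function so cosine_similarity works without them.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, AutoModel

    processor = AutoImageProcessor.from_pretrained(checkpoint)
    model = AutoModel.from_pretrained(checkpoint)  # backbone without the classifier head
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # Mean-pool the (batch, tokens, channels) output into one vector for the image.
    return outputs.last_hidden_state.mean(dim=1)[0].tolist()
```

Two images can then be compared with cosine_similarity(embed("a.jpg"), embed("b.jpg")), where the file names are placeholders. Values closer to 1 suggest more similar visual content under this embedding.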

Use Cases

Computer Vision
General Object Recognition
Identify common objects in images (e.g., animals, everyday items)
Classifies images into the 1,000 ImageNet-1k categories
Scene Understanding
Analyze image scene content (e.g., indoor/outdoor environments, building types)