
LeViT-256

Developed by Facebook
LeViT-256 is an efficient vision model based on the Transformer architecture, designed for fast inference and pretrained on the ImageNet-1k dataset.
Downloads 37
Release Time: 6/1/2022

Model Overview

LeViT is a vision model that combines convolutional neural networks with Transformers. It targets image classification tasks where fast inference matters.
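As a rough illustration of how such a model is typically used for classification, the sketch below wraps the Hugging Face `transformers` image-classification pipeline. It assumes the checkpoint is available on the Hugging Face Hub as `facebook/levit-256` and that `transformers`, `torch`, and `Pillow` are installed. The `classify` helper is a hypothetical name introduced here for illustration.

```python
# Assumption: the checkpoint is published on the Hugging Face Hub under this ID.
MODEL_ID = "facebook/levit-256"


def classify(image, top_k=5):
    """Return the top_k (label, score) pairs for an image path, URL, or PIL image."""
    # Imported lazily so that defining this helper does not require the library.
    from transformers import pipeline

    # Build an image-classification pipeline around the pretrained checkpoint.
    classifier = pipeline("image-classification", model=MODEL_ID)

    # The pipeline returns a list of dicts with "label" and "score" keys.
    predictions = classifier(image, top_k=top_k)
    return [(p["label"], p["score"]) for p in predictions]
```

Calling `classify("example.jpg")`, where `example.jpg` is a placeholder path, would download the weights on first use and return the highest-scoring ImageNet-1k labels.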

Model Features

Efficient Inference
Achieves faster inference than pure Transformer models by combining convolutional and Transformer components.
Hybrid Architecture
Combines convolutional layers, which extract local features, with Transformer attention, which captures global context.
Teacher-Student Training
Uses a teacher model to guide training (knowledge distillation), which improves the accuracy of the student model.
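The teacher-student idea can be sketched with a generic hard-label distillation loss. This page does not specify the exact loss used for LeViT-256, so the two-head setup, the equal 0.5 weighting, and all function names below are assumptions for illustration only.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def cross_entropy(logits, target):
    """Cross-entropy of one prediction against an integer class index."""
    return -log_softmax(logits)[target]


def hard_distillation_loss(class_logits, distill_logits, teacher_logits, label):
    """Average of two cross-entropy terms (assumed equal weighting).

    class_logits:   student head supervised by the ground-truth label
    distill_logits: student head supervised by the teacher's prediction
    teacher_logits: output of the (frozen) teacher model
    """
    # The teacher's hard prediction is its highest-scoring class.
    teacher_label = max(range(len(teacher_logits)), key=teacher_logits.__getitem__)
    loss_true = cross_entropy(class_logits, label)
    loss_teacher = cross_entropy(distill_logits, teacher_label)
    return 0.5 * loss_true + 0.5 * loss_teacher
```

In this sketch the student is pulled toward the ground-truth label through one head and toward the teacher's prediction through the other.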

Model Capabilities

Image Classification
Visual Feature Extraction

Use Cases

Computer Vision
Object Recognition
Identify the category of objects in images
Classifies images into the 1,000 categories of ImageNet-1k.
Scene Understanding
Analyze the content of image scenes
Recognizes scene-level categories such as palaces.
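For both use cases, a classifier of this kind produces one score (logit) per ImageNet-1k category, and the predicted categories are the highest-scoring ones. The following is a minimal, library-free sketch of that post-processing step. The function names and the three-class example are hypothetical, and a real model would emit 1,000 logits.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, labels, k=5):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(labels[i], probs[i]) for i in ranked[:k]]


# Toy example with three made-up classes instead of 1,000.
predictions = top_k([1.0, 3.0, 2.0], ["cat", "palace", "dog"], k=2)
```

Here `predictions` lists "palace" first and "dog" second, each paired with its probability.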