
LeViT-384

Developed by Facebook
LeViT-384 is a vision Transformer model pre-trained on the ImageNet-1k dataset. It borrows design elements from convolutional networks to achieve faster inference.
Downloads 37
Release Time: 6/1/2022

Model Overview

LeViT is a vision model that combines convolutional layers with a Transformer architecture and is designed for image classification. It is optimized for inference speed while maintaining high accuracy.

Model Features

Efficient Inference
Borrows design elements from convolutional networks to speed up inference relative to conventional vision Transformers
High Accuracy
Pre-trained on the ImageNet-1k dataset for 1000-class image classification
Teacher-Student Training
Trained with a teacher-student (knowledge distillation) approach to improve model performance
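
The teacher-student idea can be illustrated with a small sketch. This page does not describe the exact training recipe, so the example below assumes a common hard-label distillation scheme: one student head is supervised by the ground-truth label, a second head by the teacher's predicted class, and the two heads are averaged at inference. All function names are hypothetical and chosen for illustration only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Cross-entropy of one example against an integer class index."""
    return -math.log(softmax(logits)[target])


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


def hard_distillation_loss(cls_logits, dist_logits, teacher_logits, label):
    """Average of two cross-entropy terms (assumed scheme, see lead-in).

    cls_logits:  student head supervised by the ground-truth label
    dist_logits: student head supervised by the teacher's predicted class
    """
    teacher_label = argmax(teacher_logits)
    return 0.5 * cross_entropy(cls_logits, label) + 0.5 * cross_entropy(
        dist_logits, teacher_label
    )


def predict(cls_logits, dist_logits):
    """At inference, average the two heads and take the arg max."""
    averaged = [(a + b) / 2 for a, b in zip(cls_logits, dist_logits)]
    return argmax(averaged)
```

In this sketch the teacher is used only during training. At inference the student runs alone, so the teacher adds no cost to prediction.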

Model Capabilities

Image Classification
Visual Feature Extraction
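
Both capabilities can be exercised through the Hugging Face `transformers` library. The following is a minimal sketch, assuming the checkpoint is published on the Hub as `facebook/levit-384` and that `transformers`, `torch`, and `Pillow` are installed. The function names `classify` and `extract_features` are hypothetical helpers, not part of any library.

```python
CHECKPOINT = "facebook/levit-384"  # assumed Hugging Face Hub id for this model


def _load_image(path):
    # Pillow is imported lazily so this module can be imported without it.
    from PIL import Image

    return Image.open(path).convert("RGB")


def classify(image_path, checkpoint=CHECKPOINT):
    """Return the predicted ImageNet-1k label for one image."""
    import torch
    from transformers import (
        AutoImageProcessor,
        LevitForImageClassificationWithTeacher,
    )

    processor = AutoImageProcessor.from_pretrained(checkpoint)
    model = LevitForImageClassificationWithTeacher.from_pretrained(checkpoint)
    model.eval()

    inputs = processor(images=_load_image(image_path), return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, number of classes)
    class_id = int(logits.argmax(dim=-1))
    return model.config.id2label[class_id]


def extract_features(image_path, checkpoint=CHECKPOINT):
    """Return a pooled feature vector from the backbone, without the classifier."""
    import torch
    from transformers import AutoImageProcessor, LevitModel

    processor = AutoImageProcessor.from_pretrained(checkpoint)
    model = LevitModel.from_pretrained(checkpoint)
    model.eval()

    inputs = processor(images=_load_image(image_path), return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.pooler_output[0]  # 1-D tensor of pooled features
```

Calling `classify("photo.jpg")` on a local image (the file name is a placeholder) downloads the weights on first use and returns a label string. `extract_features` returns a vector that can feed a downstream task such as retrieval or clustering.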

Use Cases

Computer Vision
Object Recognition
Identifies objects in images and classifies them into 1000 ImageNet categories
Recognizes common objects such as animals and everyday items
Scene Understanding
Analyzes the content of image scenes
Can identify scene types such as buildings and natural landscapes
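
For either use case, the model's raw output is a vector of scores, one per category, which is usually turned into a short ranked list. The sketch below shows that post-processing step in plain Python. The `top_k` helper, the three logits, and the three label strings are made up for illustration; a real run would have 1000 scores and the ImageNet-1k label names.

```python
import math


def top_k(logits, labels, k=5):
    """Return the k most probable (label, probability) pairs, best first."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]  # subtract max for stability
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    return [(labels[i], probs[i]) for i in ranked[:k]]


# Hypothetical three-class example; a real run would have 1000 logits.
example_logits = [1.0, 3.0, 0.5]
example_labels = ["tabby cat", "golden retriever", "castle"]
for label, prob in top_k(example_logits, example_labels, k=2):
    print(f"{label}: {prob:.3f}")
```

Reporting several candidates with their probabilities, rather than a single label, makes it easier to spot low-confidence predictions.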