LeViT-192
LeViT-192 is a vision model for image classification that combines convolutional layers with a Transformer architecture.
Downloads 23
Release Time: 6/1/2022
Model Overview
The LeViT-192 model is pre-trained on the ImageNet-1k dataset at 224x224 resolution, combining the efficiency of convolutional neural networks with the powerful feature extraction capabilities of Transformers.
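A minimal inference sketch for the setup described above, using the Hugging Face `transformers` library. The checkpoint name `facebook/levit-192` and the helper `classify_image` are assumptions made for illustration; this page does not say where the weights are hosted, so substitute the identifier that applies to you. The image processor is expected to handle resizing, cropping, and normalization to the 224x224 input.

```python
# Assumption: the checkpoint is available on the Hugging Face Hub under this
# name. The page itself does not state a source for the weights.
MODEL_ID = "facebook/levit-192"


def classify_image(image_path: str, model_id: str = MODEL_ID) -> str:
    """Return the predicted ImageNet-1k label for a single image file."""
    # Heavy dependencies are imported lazily, so defining this function does
    # not require torch, Pillow, or transformers to be installed.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, AutoModelForImageClassification

    processor = AutoImageProcessor.from_pretrained(model_id)
    model = AutoModelForImageClassification.from_pretrained(model_id)
    model.eval()

    # Convert to RGB so grayscale or RGBA files do not break preprocessing.
    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits  # one score per ImageNet-1k class

    predicted_id = int(logits.argmax(dim=-1).item())
    return model.config.id2label[predicted_id]
```

Calling `classify_image("photo.jpg")` with a local image path (a hypothetical file name) downloads the weights on first use and returns a class name string.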
Model Features
Efficient Inference
Combines convolutional layers with Transformer blocks to reduce inference time.
High-Accuracy Classification
Pre-trained on the ImageNet-1k dataset, so it classifies images into that dataset's 1,000 categories.
Teacher-Student Architecture
Trained with a teacher-student setup, in which a teacher model guides the student model during training to improve accuracy.
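The page does not spell out the training objective behind the teacher-student setup. One common form, assumed here purely for illustration, is hard-label distillation: the student has a classification head trained on the true label and a distillation head trained on the teacher's predicted label, and the two heads are averaged at inference. All function names below are hypothetical, and the sketch uses plain Python lists for a single example.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def cross_entropy(logits, target):
    """Cross-entropy of one example against an integer class index."""
    return -log_softmax(logits)[target]


def hard_distillation_loss(cls_logits, dist_logits, teacher_logits, label):
    """Average the true-label loss and the teacher-label loss.

    Assumption: equal 0.5 / 0.5 weighting between the two terms.
    """
    # The teacher's "hard" label is simply its highest-scoring class.
    teacher_label = max(range(len(teacher_logits)), key=teacher_logits.__getitem__)
    loss_true = cross_entropy(cls_logits, label)
    loss_teacher = cross_entropy(dist_logits, teacher_label)
    return 0.5 * loss_true + 0.5 * loss_teacher


def combined_logits(cls_logits, dist_logits):
    """Average both heads for prediction (an assumption of this sketch)."""
    return [(a + b) / 2 for a, b in zip(cls_logits, dist_logits)]
```

The teacher is only needed during training; at inference the student runs on its own, which is why distillation does not add to prediction cost.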
Model Capabilities
Image Classification
Visual Feature Extraction
Use Cases
Computer Vision
Object Recognition
Identifies the category of the main object in an image, such as an animal or an everyday item, from the 1,000 classes of ImageNet-1k.
Scene Classification
Classifies the scene shown in an image, such as indoor, outdoor, or natural landscape.
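For the use cases above, the raw output of a classifier is one score (logit) per category. A short sketch of turning those scores into ranked, human-readable predictions follows. The function names and the three-entry label map are hypothetical stand-ins; a real ImageNet-1k label map has 1,000 entries.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, id2label, k=5):
    """Return the k most likely (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [(id2label[i], probs[i]) for i in ranked]


# Hypothetical three-class example for illustration only.
example_labels = {0: "tabby cat", 1: "golden retriever", 2: "teapot"}
predictions = top_k([2.0, 0.5, 1.0], example_labels, k=2)
```

Reporting the top few candidates with their probabilities, rather than a single label, makes it easier to spot uncertain predictions.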