LeViT-384 Open-Source Vision Model - Integrating the Advantages of Convolution to Achieve Fast Image Inference

LeViT-384

Developed by Facebook

LeViT-384 is a vision Transformer model pre-trained on the ImageNet-1k dataset. It borrows design elements from convolutional networks to achieve faster inference.

Image Classification

Transformers

Open Source License: Apache-2.0

#Efficient Image Classification #Lightweight Transformer #Fast Inference

Downloads 37

Release Time: 6/1/2022

Model Overview

LeViT is a vision model that combines convolutional networks with the Transformer architecture and is designed for image classification. It targets fast inference while maintaining high accuracy.

Model Features

Efficient Inference

Borrows from convolutional networks to improve on the inference speed of conventional vision Transformers

High Accuracy

Pre-trained on the ImageNet-1k dataset for 1000-class image classification

Teacher-Student Architecture

Trained with a teacher-student (knowledge distillation) approach to improve accuracy
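
The teacher-student idea can be illustrated with a small, dependency-free sketch. This page does not describe the training recipe in detail, so everything below is an assumption modelled on DeiT-style hard-label distillation: the student has two heads, one supervised by the true label and one by the teacher's predicted class, and the two heads are averaged at inference time. The function names are hypothetical and are not part of any library.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def cross_entropy(logits, target):
    """Cross-entropy of one sample against an integer class index."""
    return -log_softmax(logits)[target]


def hard_distillation_loss(cls_logits, dist_logits, teacher_logits, label):
    """Assumed DeiT-style loss: the classification head learns from the
    ground-truth label, the distillation head from the teacher's top class."""
    teacher_label = max(range(len(teacher_logits)), key=teacher_logits.__getitem__)
    return 0.5 * cross_entropy(cls_logits, label) + 0.5 * cross_entropy(
        dist_logits, teacher_label
    )


def averaged_logits(cls_logits, dist_logits):
    """At inference time, average the two student heads (assumed behaviour)."""
    return [(c + d) / 2 for c, d in zip(cls_logits, dist_logits)]


if __name__ == "__main__":
    cls_head = [2.0, 0.5, -1.0]
    dist_head = [1.5, 1.0, -0.5]
    teacher = [3.0, 0.0, 0.0]
    print(hard_distillation_loss(cls_head, dist_head, teacher, label=0))
    print(averaged_logits(cls_head, dist_head))
```

The teacher is only needed during training; at inference the student runs alone, which is why distillation adds no cost to prediction.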

Model Capabilities

Image Classification

Visual Feature Extraction

Use Cases

Computer Vision

Object Recognition

Identifies objects in images and classifies them into the 1000 ImageNet-1k categories

Recognizes common objects such as animals and everyday items
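
A minimal sketch of how such a classification might be run with the Hugging Face `transformers` library. The checkpoint identifier `facebook/levit-384` and the image path are assumptions, and `top_k` and `classify` are hypothetical helper names introduced here for illustration. The heavy dependencies (`torch`, `Pillow`, `transformers`) are imported inside `classify`, so the ranking helper can be used without them.

```python
def top_k(probs, id2label, k=5):
    """Return the k most probable (label, probability) pairs, best first."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


def classify(image_path, checkpoint="facebook/levit-384", k=5):
    """Classify one image into ImageNet-1k labels (checkpoint id is assumed)."""
    # Imported lazily: these packages are only needed for real inference.
    import torch
    from PIL import Image
    from transformers import (
        AutoImageProcessor,
        LevitForImageClassificationWithTeacher,
    )

    processor = AutoImageProcessor.from_pretrained(checkpoint)
    model = LevitForImageClassificationWithTeacher.from_pretrained(checkpoint)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # one row of class scores per image

    probs = logits.softmax(dim=-1)[0].tolist()
    return top_k(probs, model.config.id2label, k)


if __name__ == "__main__":
    # "example.jpg" is a placeholder path; the first call downloads the weights.
    for label, prob in classify("example.jpg"):
        print(f"{prob:.3f}  {label}")
```

The output is a ranked list of label and probability pairs; because the model predicts ImageNet-1k classes only, objects outside those 1000 categories are mapped to the nearest known label.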

Scene Understanding

Analyzes the content of image scenes

Can identify scene types such as buildings and natural landscapes
