LeViT-256 Open-Source Vision Model: An Efficient Image Recognition Assistant Designed for Fast Inference

LeViT-256

Developed by Facebook

LeViT-256 is an efficient vision model based on Transformer architecture, designed for fast inference and pretrained on the ImageNet-1k dataset.

Image Classification

Transformers

Open Source License: Apache-2.0 #Efficient Vision Transformer #Fast Image Classification #Lightweight Architecture

Downloads 37

Release Time: 6/1/2022

Model Overview

LeViT is a vision model that combines convolutional neural networks with Transformers. It targets image classification tasks where inference speed matters.

Model Features

Efficient Inference

Runs faster at inference than pure Transformer models by combining the strengths of CNNs and Transformers.
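One way to check a speed claim like this on your own hardware is to time repeated forward passes. The helper below is a minimal sketch, not part of LeViT or any library: `measure_latency` is a hypothetical name, and the callable you pass in (for example a lambda wrapping a model's forward pass on a fixed input) is your own.

```python
import statistics
import time
from typing import Callable, Dict


def measure_latency(fn: Callable[[], object], warmup: int = 3, runs: int = 20) -> Dict[str, float]:
    """Time repeated calls to fn and report latency statistics in milliseconds."""
    # Warm-up calls are discarded so one-off setup costs do not skew the numbers.
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)

    return {
        "median_ms": statistics.median(samples),
        "min_ms": min(samples),
        "max_ms": max(samples),
    }


# Stand-in workload; replace the lambda with a call into the model under test.
stats = measure_latency(lambda: sum(range(10_000)))
print(stats)
```

Comparing the median for LeViT-256 against a pure Transformer classifier on the same input size and device gives a like-for-like reading. The median is used here because it is less sensitive to occasional slow runs than the mean.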

Hybrid Architecture

Combines convolutional layers with Transformer blocks, so the model extracts both local and global features.
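To make the hybrid design concrete, the sketch below computes how many tokens reach the Transformer blocks after a convolutional stem. It assumes the stem described in the LeViT paper (four 3x3 convolutions, each with stride 2 and padding 1) and a 224x224 input. Neither detail is stated on this page, and the function names are illustrative.

```python
from typing import Tuple


def conv_out(size: int, kernel: int = 3, stride: int = 2, padding: int = 1) -> int:
    """Spatial size after one convolution (standard floor formula)."""
    return (size + 2 * padding - kernel) // stride + 1


def stem_tokens(image_size: int = 224, num_convs: int = 4) -> Tuple[int, int]:
    """Return (feature-map side, token count) after the assumed convolutional stem."""
    side = image_size
    for _ in range(num_convs):
        side = conv_out(side)
    return side, side * side


side, tokens = stem_tokens()
print(side, tokens)  # 14 196
```

Under these assumptions the convolutions reduce the image to a 14x14 grid, so attention operates on 196 tokens rather than on raw pixels.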

Teacher-Student Training

Uses a teacher model to guide training (knowledge distillation), which improves the accuracy of the student model.
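The idea can be sketched as a loss with two terms. The example assumes DeiT-style hard-label distillation with two classification heads, one supervised by the ground-truth label and one by the teacher's top prediction, with equal weights. This page does not say which distillation variant or weighting was used, so treat the functions below as an illustration rather than the authors' training code.

```python
import math
from typing import List


def log_softmax(logits: List[float]) -> List[float]:
    """Numerically stable log-softmax over one logit vector."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def cross_entropy(logits: List[float], target: int) -> float:
    """Negative log-probability assigned to the target class."""
    return -log_softmax(logits)[target]


def distillation_loss(
    cls_logits: List[float],
    dist_logits: List[float],
    teacher_logits: List[float],
    label: int,
) -> float:
    """Average of the ground-truth loss and the teacher-guided loss (assumed 50/50)."""
    # Hard distillation: the teacher's argmax acts as a second label.
    teacher_label = max(range(len(teacher_logits)), key=lambda i: teacher_logits[i])
    loss_true = cross_entropy(cls_logits, label)
    loss_teacher = cross_entropy(dist_logits, teacher_label)
    return 0.5 * (loss_true + loss_teacher)


# Toy 3-class example with made-up logits.
loss = distillation_loss([2.0, 0.5, -1.0], [1.5, 1.0, -0.5], [3.0, 0.2, 0.1], label=0)
print(round(loss, 4))
```

In a setup like this, the two heads' outputs are typically averaged at inference time, so the teacher is needed only during training.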

Model Capabilities

Image Classification

Visual Feature Extraction

Use Cases

Computer Vision

Object Recognition

Identify the category of objects in images

Classifies images into the 1,000 categories of ImageNet-1k.
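A minimal classification sketch with the Hugging Face Transformers library follows. It assumes the checkpoint is published on the Hub as `facebook/levit-256` (this page gives only the developer and model name) and that `torch`, `transformers`, and `Pillow` are installed. `top_k` and `classify` are illustrative helper names, not library functions.

```python
from typing import Dict, List, Sequence, Tuple


def top_k(logits: Sequence[float], id2label: Dict[int, str], k: int = 5) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (label, logit) pairs, best first."""
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return [(id2label[i], logits[i]) for i in ranked[:k]]


def classify(image_path: str, k: int = 5) -> List[Tuple[str, float]]:
    """Classify one image file with the pretrained checkpoint."""
    # Imported lazily so that top_k stays usable without these packages.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, LevitForImageClassificationWithTeacher

    checkpoint = "facebook/levit-256"  # assumed Hub id, not stated on this page
    processor = AutoImageProcessor.from_pretrained(checkpoint)
    model = LevitForImageClassificationWithTeacher.from_pretrained(checkpoint)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0]
    return top_k(logits.tolist(), model.config.id2label, k)
```

Calling `classify("photo.jpg")` with a local image (the file name is a placeholder) downloads the weights on first use and returns the five most likely ImageNet-1k labels with their logits.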

Scene Understanding

Analyze the content of image scenes

Can recognize scene-level ImageNet-1k categories such as palaces.

Property	Details
Model Type	LeViT-256 for image classification
Training Data	ImageNet-1k
