
DeiT Base Patch16 384

Developed by facebook
DeiT is an efficiently trained Vision Transformer model, pre-trained and fine-tuned on the ImageNet-1k dataset at 384x384 resolution, suitable for image classification tasks.
Downloads 442
Release Time: 3/2/2022

Model Overview

This model is a Vision Transformer (ViT) trained with a more data-efficient recipe that combines attention mechanisms with distillation techniques. It is primarily used for image classification.

Model Features

Efficient Training
Uses attention mechanisms and distillation techniques during training, which reduces the amount of training data required.
High Resolution Support
Supports 384x384 resolution input, improving classification accuracy.
Lightweight Architecture
The base model has 86M parameters, suitable for medium-scale deployment.
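The name "Patch16 384" implies a fixed amount of work per image: a 384x384 input cut into 16x16 patches. The short Python sketch below works out the resulting token count. The helper name `vit_token_count`, the single extra class token, and the 224x224 comparison point are assumptions made for illustration; they are not stated on this page.

```python
def vit_token_count(image_size: int, patch_size: int, extra_tokens: int = 1) -> int:
    """Return the number of tokens a ViT-style encoder sees for a square image.

    extra_tokens=1 assumes a single class token (an assumption for this sketch).
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    return patches_per_side * patches_per_side + extra_tokens


# 384 / 16 = 24 patches per side, so 24 * 24 = 576 patches plus 1 class token.
tokens_384 = vit_token_count(384, 16)
# 224 / 16 = 14 patches per side, so 14 * 14 = 196 patches plus 1 class token.
tokens_224 = vit_token_count(224, 16)

print(tokens_384, tokens_224)  # prints: 577 197
```

Under these assumptions the 384x384 input yields roughly three times as many tokens as a 224x224 input would, which is the cost side of the higher-resolution setting.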

Model Capabilities

Image Classification
Feature Extraction
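A minimal sketch of both capabilities, assuming the checkpoint is published on the Hugging Face Hub as `facebook/deit-base-patch16-384` and that `transformers`, `torch`, and `Pillow` are installed. The function names and the choice of the first-token embedding as the image feature are illustrative assumptions, not something this page prescribes.

```python
# Assumed Hub identifier; this page only gives the model name and developer.
MODEL_ID = "facebook/deit-base-patch16-384"


def classify_image(image_path: str, model_id: str = MODEL_ID) -> str:
    """Return the predicted ImageNet label for one image file."""
    # Heavy dependencies are imported lazily so the module loads without them.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, AutoModelForImageClassification

    processor = AutoImageProcessor.from_pretrained(model_id)
    model = AutoModelForImageClassification.from_pretrained(model_id)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, number of classes)
    predicted = int(logits.argmax(dim=-1).item())
    return model.config.id2label[predicted]


def extract_features(image_path: str, model_id: str = MODEL_ID):
    """Return the first-token embedding of the encoder as an image feature."""
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, AutoModel

    processor = AutoImageProcessor.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, tokens, hidden size)
    return hidden[:, 0]  # embedding at the first (class) token position
```

Calling `classify_image("cat.jpg")` (a hypothetical file name) downloads the weights on first use and returns a label string; `extract_features` returns a tensor that can feed a downstream classifier or similarity search.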

Use Cases

Computer Vision
ImageNet Classification
Classifies each image into one of the 1,000 ImageNet-1k categories.
Reported accuracy: 82.9% top-1 and 96.2% top-5.
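For readers unfamiliar with the two metrics: top-1 counts a prediction as correct only if the highest-scoring class is the true label, while top-5 counts it as correct if the true label is among the five highest-scoring classes. The sketch below shows the general top-k computation on made-up scores; the function name and the toy data are illustrative assumptions and have no connection to the reported ImageNet figures.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]], labels: Sequence[int], k: int
) -> float:
    """Fraction of samples whose true label is among the k highest scores."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy data: three samples, three classes (made up for illustration).
toy_scores = [
    [0.1, 0.7, 0.2],  # ranking: class 1, then 2, then 0
    [0.5, 0.3, 0.2],  # ranking: class 0, then 1, then 2
    [0.2, 0.3, 0.5],  # ranking: class 2, then 1, then 0
]
toy_labels = [1, 1, 0]

top1 = top_k_accuracy(toy_scores, toy_labels, k=1)  # 1 of 3 samples correct
top2 = top_k_accuracy(toy_scores, toy_labels, k=2)  # 2 of 3 samples correct
print(top1, top2)
```

Top-k accuracy can never decrease as k grows, which is why the top-5 figure is always at least as high as the top-1 figure.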