Beit Large Patch16 384

Developed by Microsoft
BEiT is a vision Transformer-based image classification model, pretrained in a self-supervised manner on ImageNet-21k and fine-tuned on ImageNet-1k.
Downloads 44
Release Time: 3/2/2022

Model Overview

The BEiT model is a Vision Transformer (ViT) pretrained in a self-supervised manner on ImageNet-21k and then fine-tuned on ImageNet-1k for image classification.

Model Features

Self-supervised Pretraining
Uses the ImageNet-21k dataset for self-supervised pretraining to learn intrinsic image representations.
High-resolution Fine-tuning
Fine-tuned on ImageNet-1k at 384x384 resolution to enhance classification performance.
Relative Position Embeddings
Uses relative position embeddings (similar to T5) instead of absolute position embeddings to enhance model flexibility.
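To make the resolution and position-embedding features above concrete, here is a minimal sketch of the arithmetic involved. It assumes the patch size of 16 implied by "Patch16" in the model name, and it shows one common way to index relative offsets between patches on a 2D grid. The function names are hypothetical, and this is an illustration of the general idea, not the model's actual implementation.

```python
def patch_grid(image_size: int = 384, patch_size: int = 16) -> int:
    """Return the number of patches along one side of a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    return image_size // patch_size


def relative_position_index(grid: int) -> list[list[int]]:
    """Map every (query patch, key patch) pair to an integer offset id.

    Two pairs share an id exactly when they have the same (row, column)
    offset, so a learned table with one entry per id can be reused at
    every absolute location on the grid.
    """
    coords = [(r, c) for r in range(grid) for c in range(grid)]
    span = 2 * grid - 1  # offsets along one axis run from -(grid-1) to grid-1
    index = []
    for r1, c1 in coords:
        row = []
        for r2, c2 in coords:
            dr = r1 - r2 + grid - 1  # shift row offset into 0..span-1
            dc = c1 - c2 + grid - 1  # shift column offset into 0..span-1
            row.append(dr * span + dc)
        index.append(row)
    return index


if __name__ == "__main__":
    side = patch_grid()                 # patches per side at 384x384
    print(side, side * side)            # grid side and total patch count
    print((2 * side - 1) ** 2)          # number of distinct 2D offsets
```

Under these assumptions, a 384x384 input yields a 24x24 grid of 576 patches and 2,209 distinct relative offsets. Because the ids depend only on offsets, the same table applies wherever a pair of patches sits in the image, which is the flexibility the feature description refers to.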

Model Capabilities

Image Classification
Feature Extraction
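Both capabilities can be sketched with the Hugging Face transformers library. The Hub identifier below is an assumption inferred from the model name and developer listed on this page, and the helper names classify_image and extract_features are hypothetical. The heavy imports are kept inside the functions so the module loads even where torch and transformers are not installed.

```python
# Assumed Hub id, inferred from the model name and developer on this page.
MODEL_ID = "microsoft/beit-large-patch16-384"


def classify_image(image, model_id: str = MODEL_ID) -> str:
    """Return the predicted ImageNet-1k label for a PIL image."""
    import torch
    from transformers import BeitForImageClassification, BeitImageProcessor

    processor = BeitImageProcessor.from_pretrained(model_id)
    model = BeitForImageClassification.from_pretrained(model_id)
    model.eval()

    # The processor resizes and normalizes the image for the model.
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, number of classes)

    class_id = int(logits.argmax(-1).item())
    return model.config.id2label[class_id]


def extract_features(image, model_id: str = MODEL_ID) -> list[float]:
    """Return a pooled feature vector for a PIL image (no classifier head)."""
    import torch
    from transformers import BeitImageProcessor, BeitModel

    processor = BeitImageProcessor.from_pretrained(model_id)
    model = BeitModel.from_pretrained(model_id)
    model.eval()

    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # pooler_output holds one pooled vector per image in the batch.
    return outputs.pooler_output[0].tolist()
```

A typical call would open an image with PIL (for example Image.open on a local file, converted to RGB) and pass it to either helper. Loading the model inside each call keeps the sketch short; a real application would load it once and reuse it.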

Use Cases

Computer Vision
ImageNet Image Classification
Classifies images into one of the 1,000 ImageNet categories.
Reported to achieve strong accuracy on the ImageNet-1k benchmark.
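Choosing one of the 1,000 categories comes down to ranking the classifier's output scores. The sketch below uses only the standard library and made-up logits to show that step. The function names and the toy scores are illustrative assumptions, not output from the model.

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Convert raw scores to probabilities, shifting by the max for stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits: list[float], k: int = 5) -> list[tuple[int, float]]:
    """Return the k most probable (class index, probability) pairs."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]


if __name__ == "__main__":
    # Toy scores over 1,000 classes; index 285 is given the highest score.
    toy_logits = [0.0] * 1000
    toy_logits[285] = 10.0
    toy_logits[281] = 7.0
    for class_index, prob in top_k(toy_logits, k=3):
        print(class_index, round(prob, 4))
```

In practice the class indices would be mapped to human-readable ImageNet labels through the label table shipped with the model.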