# Attention Distillation
X2I
Apache-2.0
X2I is a multimodal diffusion Transformer model capable of converting various input modalities (text, images, videos, audio, speech) into image outputs.
Text-to-Image Other
OPPOer
435
7
DeiT Small Patch16 224
Apache-2.0
DeiT (Data-efficient Image Transformer) is a Vision Transformer designed to be trained more efficiently. This model was pre-trained and fine-tuned on the ImageNet-1k dataset at 224x224 resolution and is suitable for image classification tasks.
Image Classification
Transformers

facebook
24.53k
8