Dino Vits8

Developed by Facebook
A Vision Transformer (ViT) model trained with the self-supervised DINO method on 8x8 image patches, suitable for image feature extraction
Downloads 106.97k
Release Time: 3/2/2022

Model Overview

This Vision Transformer model is pretrained on the ImageNet-1k dataset with the self-supervised DINO method and learns image representations that can be reused in downstream computer vision tasks

Model Features

Self-supervised learning
Trained with the self-supervised DINO method, which requires no manually annotated labels
8x8 patch processing
Splits each image into 8x8-pixel patches and processes them as a sequence, which helps capture fine-grained local features
Transformer architecture
Built on the Transformer encoder architecture, applied to the sequence of image patches to extract features
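
To make the 8x8 patch size concrete, the sketch below counts how many tokens the Transformer encoder would see for one image. The 224x224 input resolution and the single extra [CLS] token are assumptions based on common ViT setups and are not stated on this page; the helper name `vit_token_count` is hypothetical.

```python
def vit_token_count(image_size: int = 224, patch_size: int = 8) -> int:
    """Return the sequence length for a square image: one token per patch plus one [CLS] token.

    Assumes non-overlapping square patches and a single [CLS] token.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    return patches_per_side * patches_per_side + 1


print(vit_token_count(224, 8))   # 28 * 28 + 1 = 785
print(vit_token_count(224, 16))  # 14 * 14 + 1 = 197
```

Under these assumptions, 8x8 patches give roughly four times as many tokens as 16x16 patches at the same resolution, which means finer spatial detail at a higher compute cost.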

Model Capabilities

Image feature extraction
Image representation learning
Foundation model for computer vision tasks
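
As a rough illustration of feature extraction, the sketch below loads the model through the Hugging Face `transformers` library. It assumes the checkpoint is published on the Hub under the id `facebook/dino-vits8` and that `transformers`, `torch`, and Pillow are installed; the function name `extract_features` is hypothetical and this is not presented as the authors' own usage code.

```python
MODEL_ID = "facebook/dino-vits8"  # assumed Hugging Face Hub id for this model


def extract_features(image):
    """Return (cls_embedding, patch_embeddings) for one PIL image.

    Imports are local so the heavy dependencies load only when features are requested.
    """
    import torch
    from transformers import ViTImageProcessor, ViTModel

    processor = ViTImageProcessor.from_pretrained(MODEL_ID)
    model = ViTModel.from_pretrained(MODEL_ID)
    model.eval()

    # Resize and normalize the image into a batch of pixel tensors.
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    hidden = outputs.last_hidden_state  # shape: (1, 1 + num_patches, hidden_size)
    cls_embedding = hidden[:, 0]        # whole-image representation
    patch_embeddings = hidden[:, 1:]    # one vector per image patch
    return cls_embedding, patch_embeddings
```

A call such as `extract_features(Image.open(path).convert("RGB"))` would return the [CLS] vector, usually used as a whole-image representation, together with the per-patch vectors, which are the natural input for dense tasks.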

Use Cases

Computer vision
Image classification
Can serve as a foundation model: add a classification head on top of the extracted features and train it for image classification
Object detection
Extracted image features can be used for object detection tasks
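
The classification use case amounts to a linear layer applied to the extracted feature vector. The dependency-free sketch below shows only that arithmetic on a toy 4-dimensional vector; the weights, the feature values, and the helper names `linear_head` and `softmax` are made up for illustration. In practice one would train a framework layer (for example `torch.nn.Linear`) on real embeddings.

```python
import math
from typing import List


def linear_head(features: List[float], weights: List[List[float]], bias: List[float]) -> List[float]:
    """Compute class logits: one dot product per class, plus that class's bias."""
    return [sum(w * x for w, x in zip(row, features)) + b for row, b in zip(weights, bias)]


def softmax(logits: List[float]) -> List[float]:
    """Turn logits into probabilities, subtracting the max for numerical stability."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy feature vector standing in for an image embedding (illustrative values only).
features = [1.0, 0.0, -1.0, 2.0]
# Two classes: one weight row per class (illustrative, untrained values).
weights = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
bias = [0.0, 0.0]

logits = linear_head(features, weights, bias)  # [1.0, 2.0]
probs = softmax(logits)
predicted = probs.index(max(probs))            # class 1 has the larger logit
print(logits, predicted)
```

With a frozen backbone, only the head's weights and bias would be learned.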