Vit Base Patch16 224.dino Mlxim
Developed by mlx-vision
A Vision Transformer image model pretrained on the ImageNet-1k dataset with the DINO self-supervised method. It provides a backbone for image classification and feature extraction.
Downloads 43
Release Time: 4/6/2024
Model Overview
This model is a Vision Transformer (ViT-Base, 16x16 patches, 224x224 input) pretrained with the DINO self-supervised learning method. Only the backbone was trained; there is no classification head, so classification requires fitting a head on top of the backbone features.
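Because the checkpoint has no classification head, using it for classification means fitting a small classifier on top of frozen backbone features. The following is a minimal sketch of one simple option, a nearest-centroid classifier, in plain Python. The feature vectors are tiny made-up stand-ins (a real ViT-Base embedding has 768 dimensions), and `fit_centroids` and `predict` are hypothetical helper names, not part of mlx-image.

```python
from collections import defaultdict
from math import sqrt


def fit_centroids(features, labels):
    """Average the feature vectors of each class into one centroid per class."""
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(centroids, vec):
    """Return the label whose centroid is closest to vec (Euclidean distance)."""
    def dist(a, b):
        return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    return min(centroids, key=lambda label: dist(centroids[label], vec))


# Toy stand-ins for backbone embeddings (assumption: real ones come from the model).
train_features = [
    [0.9, 0.1, 0.0],
    [1.1, -0.1, 0.0],
    [0.0, 1.0, 0.2],
    [0.0, 0.8, -0.2],
]
train_labels = ["cat", "cat", "dog", "dog"]

centroids = fit_centroids(train_features, train_labels)
prediction = predict(centroids, [0.8, 0.2, 0.1])
print(prediction)
```

In practice a linear layer or k-nearest-neighbor classifier trained on the extracted features plays the same role; the centroid version is used here only because it fits in a few lines.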
Model Features
Self-supervised learning
Pretrained with the DINO method, which learns from unlabeled images, so pretraining requires no labeled data.
Attention mechanism visualization
Supports generating attention heatmaps that show which image regions the model attends to.
Feature extraction
Can extract the backbone features (the layer output that would feed a classification head), which are suitable for transfer learning.
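As a rough illustration of the attention heatmaps listed above: with 16x16 patches on a 224x224 input, the image is split into a 14x14 grid (196 patches), plus one class token. A common way to draw a heatmap is to take the class token's attention over the patch tokens, average it over heads, and reshape it to the patch grid. The sketch below does this in plain Python on synthetic weights. `cls_attention_heatmap` is a hypothetical helper, and how the attention weights are obtained from the model is not shown.

```python
def cls_attention_heatmap(cls_attention, grid_size):
    """Turn per-head class-token attention into a normalized 2D heatmap.

    cls_attention: list of heads; each head is a list of 1 + grid_size**2
    weights (class token first, then patches in row-major order).
    """
    num_patches = grid_size * grid_size
    num_heads = len(cls_attention)
    # Average over heads, skipping index 0 (class token attending to itself).
    mean = [
        sum(head[1 + i] for head in cls_attention) / num_heads
        for i in range(num_patches)
    ]
    lo, hi = min(mean), max(mean)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat map
    scaled = [(m - lo) / span for m in mean]
    # Reshape the flat patch list into grid_size rows.
    return [scaled[r * grid_size:(r + 1) * grid_size] for r in range(grid_size)]


patch_size, image_size = 16, 224
grid = image_size // patch_size  # 14 patches per side

# Synthetic, unnormalized weights for 2 heads: flat, with one boosted patch.
hot_row, hot_col = 3, 5
heads = []
for boost in (0.2, 0.4):
    weights = [0.001] * (1 + grid * grid)
    weights[1 + hot_row * grid + hot_col] += boost
    heads.append(weights)

heatmap = cls_attention_heatmap(heads, grid)
print(len(heatmap), len(heatmap[0]), heatmap[hot_row][hot_col])
```

Each heatmap cell corresponds to one 16x16 pixel patch; for display, the grid is typically upsampled to the input resolution and overlaid on the image.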
Model Capabilities
Image classification
Feature extraction
Attention visualization
Use Cases
Computer vision
Image classification
Classify and recognize input images
Visual feature extraction
Extract high-level feature representations of images for downstream tasks
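One common downstream use of such features is image retrieval: embed every image once, then rank gallery images by cosine similarity to a query embedding. The sketch below shows only the ranking step in plain Python, with short made-up vectors in place of real embeddings. `cosine_similarity` and `rank_by_similarity` are illustrative names, not mlx-image functions.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, gallery):
    """Return gallery keys sorted from most to least similar to the query."""
    return sorted(
        gallery,
        key=lambda name: cosine_similarity(query, gallery[name]),
        reverse=True,
    )


# Made-up 3-dimensional "embeddings" (assumption: real ones come from the backbone).
gallery = {
    "img_a": [1.0, 0.0, 0.0],
    "img_b": [0.7, 0.7, 0.0],
    "img_c": [0.0, 0.0, 1.0],
}
query = [0.9, 0.1, 0.0]

ranking = rank_by_similarity(query, gallery)
print(ranking)
```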
© 2025 AIbase