Dinov2 Large

Developed by facebook
A Vision Transformer model trained with the DINOv2 method. It learns robust visual features from large-scale image data through self-supervised learning.
Downloads 558.78k
Release Time: 7/17/2023

Model Overview

This model uses a Transformer encoder architecture and is pre-trained in a self-supervised manner. It learns image representations that can be used for feature extraction in a range of downstream computer vision tasks.

Model Features

Self-supervised learning
Learns features from large-scale image data without human annotations
Robust visual features
Capable of extracting general visual features applicable to multiple downstream tasks
Transformer architecture
Built on a Transformer encoder architecture for processing image data
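
To make the Transformer encoder input concrete, here is a minimal sketch of how an image becomes a token sequence. It assumes a patch size of 14 pixels and a single extra class token, which is the usual DINOv2 configuration but is not stated on this page; the function name `num_tokens` is hypothetical.

```python
def num_tokens(image_size: int, patch_size: int = 14, extra_tokens: int = 1) -> int:
    """Count the tokens a ViT-style encoder sees for a square image.

    patch_size=14 and extra_tokens=1 (the class token) are assumptions
    about the DINOv2 configuration, not values taken from this page.
    """
    if image_size <= 0 or patch_size <= 0:
        raise ValueError("image_size and patch_size must be positive")
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    patches_per_side = image_size // patch_size
    # One token per non-overlapping patch, plus any extra tokens.
    return patches_per_side * patches_per_side + extra_tokens


print(num_tokens(224))  # 16 x 16 patches + 1 class token = 257
```

Under these assumptions, a 224 x 224 image yields 257 tokens, each of which the encoder maps to a feature vector.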

Model Capabilities

Image feature extraction
Visual representation learning
Foundation model for computer vision tasks
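
Feature extraction could look roughly like the sketch below. It assumes the model is published on the Hugging Face Hub as `facebook/dinov2-large` and that the `transformers`, `torch`, and `Pillow` packages are installed; the function name `extract_features` is hypothetical.

```python
from typing import Any

# Assumed Hugging Face Hub identifier for this model.
MODEL_ID = "facebook/dinov2-large"


def extract_features(image_path: str, model_id: str = MODEL_ID) -> Any:
    """Return one global feature vector (a torch tensor) for an image file."""
    # Imported lazily so that defining this function does not require
    # the heavy dependencies to be installed.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, AutoModel

    processor = AutoImageProcessor.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()  # inference mode: disables dropout

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    with torch.no_grad():  # no gradients needed for feature extraction
        outputs = model(**inputs)

    # last_hidden_state has shape (batch, tokens, hidden_size);
    # token 0 is the class token, used here as the global image feature.
    return outputs.last_hidden_state[0, 0]
```

Calling `extract_features("photo.jpg")` downloads the weights on first use and returns a single vector; the remaining tokens in `last_hidden_state` are per-patch features that dense tasks can use instead.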

Use Cases

Computer Vision
Image classification
Fine-tune by adding a classification head on top of the pre-trained model
Object detection
Used as a feature extractor for object detection tasks
Image similarity calculation
Compute image similarity using extracted feature vectors
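
As an illustration of the image similarity use case, the sketch below compares feature vectors with cosine similarity. Cosine similarity is one common choice and is an assumption here, since the page does not name a metric; the helper names and the toy vectors are hypothetical stand-ins for features extracted by the model.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors, in [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float], gallery: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Sort gallery items from most to least similar to the query."""
    scores = [(name, cosine_similarity(query, vec)) for name, vec in gallery.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy 2-D vectors standing in for real image features.
gallery = {"cat.jpg": [0.9, 0.1], "car.jpg": [0.1, 0.9]}
ranking = rank_by_similarity([1.0, 0.0], gallery)
print(ranking[0][0])  # cat.jpg
```

Because cosine similarity ignores vector length, it compares the direction of two feature vectors rather than their magnitude.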