Webssl Dino300m Full2b 224

Developed by facebook
A Vision Transformer with 224×224 input resolution, trained on 2 billion MetaCLIP images using the DINOv2 self-supervised learning method
Downloads 503
Release date: 4/25/2025

Model Overview

This is a 300-million-parameter Vision Transformer model trained via self-supervised learning on 2 billion web images without language supervision, suitable for various visual tasks.

Model Features

Large-scale self-supervised learning
Trained on 2 billion web images without any language supervision
High-performance visual representation
Performance comparable to or surpassing language-supervised models across various visual tasks
Fixed-resolution input
Accepts 224×224 pixel input images
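
To make the 224×224 input size concrete, the sketch below counts how many patch tokens a Vision Transformer produces for one image. The patch size of 14 is an assumption borrowed from the usual DINOv2 configuration; this page does not state it, and `patch_grid` is a hypothetical helper name.

```python
def patch_grid(image_size: int, patch_size: int) -> tuple[int, int, int]:
    """Return (rows, cols, total_patches) for a square image split into square patches."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    side = image_size // patch_size
    return side, side, side * side


# ASSUMPTION: patch size 14 (common for DINOv2 models), not confirmed by this page.
rows, cols, num_patches = patch_grid(image_size=224, patch_size=14)
print(f"{rows} x {cols} grid, {num_patches} patch tokens per image")
```

Under that assumption each image becomes a 16 × 16 grid of patch tokens, which is the sequence the transformer layers operate on.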

Model Capabilities

Image feature extraction
Visual representation learning
Image classification
Object detection
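
The capabilities above all start from image feature extraction. Here is a minimal sketch of how that might look with the Hugging Face `transformers` library. It rests on several assumptions not confirmed by this page: that the checkpoint is published under the Hub ID `facebook/webssl-dino300m-full2b-224`, that it loads through `AutoModel`, and that the first output token is the class token with the patch tokens following it (the DINOv2 layout without register tokens). `extract_features` is a hypothetical helper name.

```python
# ASSUMPTION: Hub ID inferred from the model name; this page does not give it.
MODEL_ID = "facebook/webssl-dino300m-full2b-224"


def extract_features(image_path: str):
    """Return (cls_embedding, patch_embeddings) for a single image file."""
    # Heavy dependencies are imported lazily so this module can be loaded without them.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, AutoModel

    processor = AutoImageProcessor.from_pretrained(MODEL_ID)
    model = AutoModel.from_pretrained(MODEL_ID)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    # No gradients are needed when the model is used as a frozen feature extractor.
    with torch.no_grad():
        outputs = model(**inputs)

    tokens = outputs.last_hidden_state  # shape: (1, 1 + num_patches, hidden_dim)
    # ASSUMPTION: token 0 is the class token, the remaining tokens are patches.
    cls_embedding = tokens[:, 0]
    patch_embeddings = tokens[:, 1:]
    return cls_embedding, patch_embeddings
```

The class-token embedding is a natural input for image-level tasks such as classification, while the per-patch embeddings keep spatial layout and suit dense tasks such as detection.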

Use Cases

Computer vision
Image classification
Perform image classification tasks using features extracted by the model
Object detection
Use the model as a backbone and attach a detection head to perform object detection
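
As an illustration of the image classification use case, the sketch below classifies feature vectors with a nearest-centroid rule under cosine similarity. This is a simple stand-in chosen for illustration, not the evaluation method used by the model's authors, and the 3-dimensional vectors are toy placeholders for real embeddings extracted by the model. All function names are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fit_centroids(features, labels):
    """Average the unit-normalized features of each class into one centroid per label."""
    sums, counts = {}, {}
    for vec, label in zip(features, labels):
        unit = l2_normalize(vec)
        if label not in sums:
            sums[label] = [0.0] * len(unit)
            counts[label] = 0
        sums[label] = [s + u for s, u in zip(sums[label], unit)]
        counts[label] += 1
    return {
        label: l2_normalize([s / counts[label] for s in total])
        for label, total in sums.items()
    }


def predict(centroids, vec):
    """Return the label whose centroid has the highest cosine similarity to vec."""
    unit = l2_normalize(vec)
    return max(
        centroids,
        key=lambda label: sum(c * u for c, u in zip(centroids[label], unit)),
    )


# Toy stand-ins for embeddings; real features would come from the model.
train_features = [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1], [0.0, 1.0, 0.1], [0.1, 0.9, 0.0]]
train_labels = ["cat", "cat", "dog", "dog"]

centroids = fit_centroids(train_features, train_labels)
prediction = predict(centroids, [0.8, 0.2, 0.0])
print(prediction)
```

Because the backbone stays frozen, only the centroids depend on the labeled data, so new classes can be added without retraining the model. A linear classifier trained on the same features is a common alternative.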