Aimv2 Large Patch14 448

Developed by Apple
AIMv2 is a series of vision models pretrained with a multimodal autoregressive objective, with strong results on multiple benchmarks
Downloads 2,210
Release Time: 10/29/2024

Model Overview

AIMv2 employs multimodal autoregressive objectives for pretraining, demonstrating strong performance in vision tasks like image classification and object detection
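The model name suggests a patch size of 14 and a 448×448 input resolution. Assuming that reading is correct and that the image is split into non-overlapping square patches, the number of patch tokens the encoder sees can be worked out as in this small sketch (the helper name `patch_grid` is illustrative, not part of any library):

```python
def patch_grid(image_size: int, patch_size: int) -> tuple[int, int]:
    """Return (patches per side, total patches) for a square image.

    Assumes non-overlapping square patches that tile the image exactly.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    side = image_size // patch_size
    return side, side * side


# Values read off the model name "Patch14 448" (an assumption about the naming).
side, num_patches = patch_grid(448, 14)
print(side, num_patches)  # 32 1024
```

Under the same assumption, a 224×224 input would give a 16×16 grid, so the 448 variant processes four times as many patch tokens per image.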

Model Features

Multimodal Autoregressive Pretraining
Pretrained with a multimodal autoregressive objective, in which the model learns to predict both image patches and text tokens
Outstanding Performance
Outperforms OpenAI CLIP and SigLIP on most multimodal understanding benchmarks, and DINOv2 on open-vocabulary object detection and referring expression comprehension
Scalability
The pretraining method is straightforward and scales effectively

Model Capabilities

Image feature extraction
Image classification
Multimodal understanding
Open-vocabulary object detection
Referring expression comprehension
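Image feature extraction, the first capability listed above, might look like the following sketch using the Hugging Face `transformers` library. Several things here are assumptions not stated on this page: the Hub identifier `apple/aimv2-large-patch14-448`, the need for `trust_remote_code=True` on older `transformers` versions, and the choice of mean pooling over patch tokens as a single-vector image embedding. The function name `extract_features` is illustrative.

```python
MODEL_ID = "apple/aimv2-large-patch14-448"  # assumed Hugging Face Hub identifier


def extract_features(image, model_id=MODEL_ID):
    """Return per-patch features and a mean-pooled embedding for one PIL image."""
    # Imported lazily so this sketch can be loaded without the heavy dependencies.
    import torch
    from transformers import AutoImageProcessor, AutoModel

    processor = AutoImageProcessor.from_pretrained(model_id)
    # trust_remote_code lets the Hub repository supply its own model class if needed.
    model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
    model.eval()

    # The processor resizes and normalizes the image into a tensor batch.
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    patch_features = outputs.last_hidden_state  # (batch, num_patches, hidden_dim)
    pooled = patch_features.mean(dim=1)         # simple average over patch tokens
    return patch_features, pooled
```

Calling `extract_features` with a Pillow image (for example, one opened with `PIL.Image.open`) downloads the weights on first use. The pooled vector can then feed a downstream classifier; mean pooling is only one common option, and other pooling strategies may suit a given task better.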

Use Cases

Computer Vision
Image Classification
Performs image classification tasks on datasets like ImageNet
Achieves 87.9% accuracy on ImageNet-1k
Fine-Grained Classification
Handles domain-specific fine-grained image classification tasks
Achieves 96.6% accuracy on Stanford Cars
Medical Image Analysis
Processes medical image classification tasks
Achieves 94.1% accuracy on Camelyon17
Remote Sensing Image Processing
Satellite Image Classification
Handles satellite and aerial image classification tasks
Achieves 98.6% accuracy on EuroSAT
© 2025 AIbase