
ViT-bigG-14 CLIPA DataComp-1B

Developed by UCSC-VLAA
A CLIPA-v2 model focused on zero-shot image classification, learning efficient visual representations through contrastive image-text training
Downloads: 623
Release Date: 10/20/2023

Model Overview

This is a contrastive image-text model built on the CLIPA-v2 architecture and designed for zero-shot image classification. Trained on a large-scale image-text dataset, it learns the relationship between images and text, enabling classification without task-specific fine-tuning.
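The zero-shot mechanism described above can be sketched in a few lines: the model embeds an image and a set of class prompts (e.g. "a photo of a dog") into a shared space, and the class whose prompt embedding is most similar to the image embedding wins. A minimal NumPy sketch follows; the embeddings here are synthetic stand-ins for the outputs of the real CLIPA-v2 encoders, and the temperature value is illustrative.

```python
import numpy as np

def zero_shot_classify(image_emb, text_embs, temperature=0.01):
    """Score an image against class-prompt embeddings, CLIP-style.

    image_emb: (d,) image embedding; text_embs: (n_classes, d) embeddings
    of class prompts. Both are hypothetical stand-ins for encoder outputs.
    """
    # L2-normalize so the dot product equals cosine similarity
    image_emb = image_emb / np.linalg.norm(image_emb)
    text_embs = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = text_embs @ image_emb / temperature
    # Numerically stable softmax over classes
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

# Toy example: 3 one-hot "prompt" embeddings, image closest to class 1
probs = zero_shot_classify(np.array([0.1, 0.9, 0.1, 0.0]), np.eye(3, 4))
print(probs.argmax())  # → 1
```

In real usage the two encoders of the pretrained model would produce `image_emb` and `text_embs`; the ranking-by-cosine-similarity step is the same.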

Model Features

Efficient Zero-shot Learning
Achieves image classification without task-specific training
Low-Cost High Performance
Achieves 81.1% zero-shot ImageNet accuracy with relatively low training cost
Inverse Scaling Law
Follows CLIPA's inverse scaling law, under which larger image/text encoders can be trained on shorter image and text token sequences, balancing model performance against computational cost
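The contrastive image-text training mentioned above pairs each image in a batch with its caption and pushes mismatched pairs apart. A simplified NumPy sketch of the symmetric InfoNCE objective used by CLIP-style models is shown below; the real training uses a learned temperature and very large batches, so treat this as illustrative only.

```python
import numpy as np

def clip_contrastive_loss(img, txt, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    img, txt: (batch, d) arrays; row i of each is a matched pair.
    """
    img = img / np.linalg.norm(img, axis=1, keepdims=True)
    txt = txt / np.linalg.norm(txt, axis=1, keepdims=True)
    logits = img @ txt.T / temperature   # (batch, batch) similarities
    labels = np.arange(len(img))         # matched pairs lie on the diagonal

    def xent(l):
        # Cross-entropy with the diagonal as the target class
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # Average the image->text and text->image directions
    return 0.5 * (xent(logits) + xent(logits.T))
```

With perfectly aligned pairs the loss approaches zero; shuffling the text rows against the images drives it up, which is the signal that teaches the two encoders a shared embedding space.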

Model Capabilities

Zero-shot image classification
Image-text contrastive learning
Cross-modal representation learning

Use Cases

Computer Vision
Image Classification
Classifies arbitrary images without task-specific training
Achieves 81.1% zero-shot accuracy on ImageNet
Image-Text Retrieval
Retrieves relevant images based on text descriptions or vice versa
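The retrieval use case above reduces to ranking one modality's embeddings by cosine similarity to a query embedding from the other modality; because both encoders map into the same space, the same routine serves text-to-image and image-to-text search. A small NumPy sketch, with synthetic embeddings standing in for encoder outputs:

```python
import numpy as np

def retrieve(query_emb, gallery_embs, top_k=3):
    """Rank gallery items by cosine similarity to a query embedding.

    Direction-agnostic: a text query over an image gallery, or vice
    versa, since both live in the shared embedding space.
    """
    q = query_emb / np.linalg.norm(query_emb)
    g = gallery_embs / np.linalg.norm(gallery_embs, axis=1, keepdims=True)
    sims = g @ q
    order = np.argsort(-sims)[:top_k]    # indices of best matches first
    return order, sims[order]

# Toy gallery of 3 embeddings; item 1 matches the query exactly
gallery = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.5, 0.5, 0.0]])
order, sims = retrieve(np.array([1.0, 0.0, 0.0]), gallery, top_k=2)
print(order)  # → [1 2]
```

At scale, the gallery embeddings are precomputed once and the ranking step is a single matrix-vector product, which is what makes CLIP-style retrieval cheap at query time.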