MetaCLIP B16 FullCC2.5B
MetaCLIP is an implementation of the CLIP training framework applied to CommonCrawl data, aiming to reveal the otherwise undisclosed data curation method behind CLIP's training set
Downloads: 90.78k
Release Date: 10/9/2023
Model Overview
This model constructs a shared image-text embedding space, supporting tasks such as zero-shot image classification and text-based image retrieval
Model Features
Data Transparency
First public disclosure of the data curation pipeline for CLIP-style models
Large-scale Training
Trained on 2.5 billion image-text pairs curated from CommonCrawl
Multimodal Capability
Processes visual and textual information jointly in a single model
Model Capabilities
Zero-shot image classification
Text-based image retrieval
Image-based text retrieval
Cross-modal embedding learning
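All of the capabilities above reduce to comparing vectors in the shared image-text embedding space. A minimal sketch of how zero-shot classification scores are derived, using random vectors as stand-ins for real MetaCLIP embeddings (the function name and the logit scale of 100 are illustrative assumptions, not part of the model card):

```python
import numpy as np

def zero_shot_scores(image_emb, text_embs):
    """Score candidate labels for one image: cosine similarity -> softmax."""
    # L2-normalize so the dot product equals cosine similarity
    image_emb = image_emb / np.linalg.norm(image_emb)
    text_embs = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = text_embs @ image_emb  # one similarity score per candidate label
    logits = logits * 100.0         # CLIP-style logit scaling before softmax
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()          # probability distribution over the labels

# Dummy 512-dimensional embeddings standing in for model outputs
rng = np.random.default_rng(0)
image = rng.normal(size=512)
texts = rng.normal(size=(3, 512))  # one row per candidate text label
probs = zero_shot_scores(image, texts)
```

The same similarity computation, transposed, gives text-based image retrieval: compare one text embedding against many image embeddings instead.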
Use Cases
Content Retrieval
Music Scene Recognition
Identifies music-related scene elements in images
Can distinguish between scene labels such as 'playing music' and 'doing sports'
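A sketch of this use case with the Hugging Face transformers CLIP classes, assuming the checkpoint id `facebook/metaclip-b16-fullcc2.5b` on the Hub; the blank placeholder image should be replaced with a real photo in practice:

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Checkpoint id assumed from the Hugging Face Hub
ckpt = "facebook/metaclip-b16-fullcc2.5b"
model = CLIPModel.from_pretrained(ckpt)
processor = CLIPProcessor.from_pretrained(ckpt)

image = Image.new("RGB", (224, 224))  # placeholder; use a real photo
labels = ["a photo of people playing music", "a photo of people doing sports"]

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)
# One probability per candidate label for the single input image
probs = outputs.logits_per_image.softmax(dim=-1)
```

Because the labels are free-form text, no fine-tuning is needed to swap in a different set of scene categories.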
Multimodal Applications
Image-Text Matching System
Builds systems that associate images with their descriptive texts
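For a matching system, image embeddings are typically computed once and queries are scored against the whole index in a single matrix product. A minimal sketch with dummy vectors in place of MetaCLIP embeddings (helper names are illustrative):

```python
import numpy as np

def build_index(embs):
    """Normalize embeddings once so retrieval is a single matrix product."""
    return embs / np.linalg.norm(embs, axis=1, keepdims=True)

def top_k(index, query, k=3):
    """Return indices of the k items most similar to a query embedding."""
    q = query / np.linalg.norm(query)
    sims = index @ q                 # cosine similarity to every item
    return np.argsort(-sims)[:k]     # best matches first

# Dummy embeddings standing in for precomputed MetaCLIP image embeddings
rng = np.random.default_rng(1)
image_index = build_index(rng.normal(size=(100, 512)))
query = rng.normal(size=512)         # would be a text embedding in practice
matches = top_k(image_index, query)
```

The same index serves both directions: text queries retrieve images, and an image embedding used as the query retrieves the closest stored texts.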