
MetaCLIP B16 400M

Developed by Facebook (Meta AI)
MetaCLIP is a vision-language model trained on CommonCrawl data that constructs a shared image-text embedding space.
Release Time: 10/9/2023

Model Overview

This model applies the MetaCLIP curation pipeline to CommonCrawl, selecting 400 million image-text pairs, in order to make CLIP's training-data selection method transparent. It supports cross-modal understanding between images and text.
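The shared embedding space works as follows: the image encoder and text encoder each map their input to a vector, and after normalization the dot product of two vectors is their cosine similarity, so matching image-text pairs score highest. A minimal sketch of that scoring step, using random toy vectors in place of actual MetaCLIP encoder outputs:

```python
import numpy as np

def normalize(v):
    """Scale each row to unit length so dot products become cosine similarities."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Toy stand-ins for encoder outputs; the real B16 model produces 512-d embeddings
rng = np.random.default_rng(0)
image_emb = normalize(rng.normal(size=(1, 512)))   # one image embedding
text_embs = normalize(rng.normal(size=(3, 512)))   # three candidate captions

# In a shared embedding space, the best caption is the most similar vector
sims = image_emb @ text_embs.T                     # shape (1, 3)
best_caption = int(sims.argmax())
print(sims.shape, best_caption)
```

The same scoring rule underlies all of the capabilities listed below; only which side (image or text) is the query changes.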

Model Features

Public data training
Trained on openly available CommonCrawl data, making the training-data curation process transparent
Cross-modal understanding
Can process both visual and textual information to establish shared embedding spaces
Zero-shot learning
Capable of performing new tasks without task-specific training

Model Capabilities

Zero-shot image classification
Text-based image retrieval
Image-based text retrieval
Cross-modal feature extraction
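Zero-shot classification with a MetaCLIP checkpoint can be sketched using the standard CLIP classes in the Hugging Face transformers library. The model id `facebook/metaclip-b16-400m` and the label prompts here are assumptions for illustration; check the actual model card for the exact identifier.

```python
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Assumed Hugging Face model id for this checkpoint; verify on the model card
MODEL_ID = "facebook/metaclip-b16-400m"

model = CLIPModel.from_pretrained(MODEL_ID)
processor = CLIPProcessor.from_pretrained(MODEL_ID)

# Placeholder image; in practice, load your own with Image.open(path)
image = Image.new("RGB", (224, 224), "white")
labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)

# logits_per_image holds the image's similarity to each text prompt;
# softmax turns the scores into a probability over the candidate labels
probs = outputs.logits_per_image.softmax(dim=-1)
for label, p in zip(labels, probs[0].tolist()):
    print(f"{label}: {p:.3f}")
```

No task-specific fine-tuning is involved: changing the task is just changing the list of text prompts.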

Use Cases

Content retrieval
Image search engine
Retrieve relevant images using natural language descriptions
Intelligent labeling
Automatic image tagging
Generate descriptive labels for unlabeled images
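An image search engine of this kind reduces to three steps: embed the image corpus once offline, embed each incoming text query, then rank images by cosine similarity. A toy sketch of the ranking step, with random unit vectors standing in for MetaCLIP embeddings:

```python
import numpy as np

def normalize(v):
    """Scale rows to unit length so dot products are cosine similarities."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

rng = np.random.default_rng(1)

# Precomputed image index: in practice these are MetaCLIP image embeddings
# stored offline, keyed by image id
image_index = normalize(rng.normal(size=(1000, 512)))
image_ids = [f"img_{i:04d}" for i in range(1000)]

# Embed the query text (toy vector here) and rank the corpus by similarity
query_emb = normalize(rng.normal(size=(512,)))
scores = image_index @ query_emb
top5 = np.argsort(scores)[::-1][:5]
results = [(image_ids[i], float(scores[i])) for i in top5]
print(results)
```

The image-based text retrieval direction is symmetric: index text embeddings and use an image embedding as the query.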