
MetaCLIP B32 400M

Developed by Facebook
The MetaCLIP base model is a vision-language model trained on curated CommonCrawl data to construct a shared image-text embedding space.
Release Date: 10/7/2023

Model Overview

This model applies the MetaCLIP data-curation method to 400 million image-text pairs, supporting tasks such as zero-shot image classification and text-based image retrieval.
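Zero-shot classification with this model can be sketched with the Hugging Face transformers library. This is a minimal, hedged example: the checkpoint id "facebook/metaclip-b32-400m" is assumed from the model name, and the solid-color image is a stand-in for a real photo.

```python
# Sketch of zero-shot image classification, assuming the checkpoint id
# "facebook/metaclip-b32-400m" (adjust if your hub id differs).
from PIL import Image
from transformers import AutoModel, AutoProcessor

processor = AutoProcessor.from_pretrained("facebook/metaclip-b32-400m")
model = AutoModel.from_pretrained("facebook/metaclip-b32-400m")

image = Image.new("RGB", (224, 224), color="red")  # stand-in for a real photo
labels = ["a photo of a cat", "a photo of a dog", "a red square"]

# The processor tokenizes the candidate labels and preprocesses the image;
# the model scores every image-text pair in one forward pass.
inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)
probs = outputs.logits_per_image.softmax(dim=-1)  # shape: (1, len(labels))
best = labels[probs.argmax().item()]
```

Because no label set is fixed at training time, the same checkpoint classifies against any list of text prompts supplied at inference.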

Model Features

Large-scale Data Training
Trained on 400 million image-text pairs curated from CommonCrawl, providing strong generalization
Zero-shot Learning Capability
Capable of performing various vision tasks without task-specific fine-tuning
Shared Embedding Space
Constructs a unified representation space for images and text, supporting cross-modal retrieval

Model Capabilities

Zero-shot Image Classification
Text-based Image Retrieval
Image-based Text Retrieval
Cross-modal Representation Learning
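The retrieval capabilities above all reduce to one operation: comparing image and text vectors by cosine similarity inside the shared embedding space. The sketch below illustrates this with synthetic stand-in vectors (not real MetaCLIP outputs), where the "text query" embedding is constructed to lie near one image embedding.

```python
import numpy as np

# Toy illustration of cross-modal retrieval in a shared embedding space.
# These vectors are synthetic stand-ins for MetaCLIP embeddings.
rng = np.random.default_rng(0)
image_embeds = rng.normal(size=(5, 512))                     # 5 "images"
text_embed = image_embeds[3] + 0.01 * rng.normal(size=512)   # query near image 3

def normalize(x):
    # Scale vectors to unit length so dot products equal cosine similarity.
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

scores = normalize(image_embeds) @ normalize(text_embed)  # one score per image
best_index = int(scores.argmax())  # → 3, the image the query was built from
```

Image-based text retrieval is the same computation with the roles swapped: normalize the candidate text embeddings and score them against one image embedding.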

Use Cases

Content Retrieval
Image Search Engine
Retrieve relevant images using natural language descriptions
Content Classification
Zero-shot Image Classification
Classify images into previously unseen categories without additional training
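An image search engine built on this model encodes the image collection once, then ranks it against each incoming text query. A minimal sketch, again assuming the "facebook/metaclip-b32-400m" checkpoint id and using solid-color images as a stand-in collection:

```python
# Hedged sketch of text-based image retrieval; the checkpoint id is assumed
# and the solid-color "collection" stands in for real photos.
import torch
from PIL import Image
from transformers import AutoModel, AutoProcessor

processor = AutoProcessor.from_pretrained("facebook/metaclip-b32-400m")
model = AutoModel.from_pretrained("facebook/metaclip-b32-400m")

images = [Image.new("RGB", (224, 224), c) for c in ("red", "green", "blue")]
query = "a solid green image"

with torch.no_grad():
    # Encode the collection and the query into the shared embedding space.
    img_feats = model.get_image_features(**processor(images=images, return_tensors="pt"))
    txt_feats = model.get_text_features(**processor(text=[query], return_tensors="pt", padding=True))

# Normalize, then rank images by cosine similarity to the query.
img_feats = img_feats / img_feats.norm(dim=-1, keepdim=True)
txt_feats = txt_feats / txt_feats.norm(dim=-1, keepdim=True)
scores = (txt_feats @ img_feats.T).squeeze(0)   # one score per image
ranked = scores.argsort(descending=True)        # best match first
```

In practice the image embeddings would be precomputed and stored in a vector index, so each query costs only one text-encoder forward pass plus a nearest-neighbor lookup.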