Mexma Siglip2

Developed by visheratin
MEXMA-SigLIP2 is a high-performance CLIP-style model that combines the MEXMA multilingual text encoder with the SigLIP2 image encoder and supports 80 languages.
Downloads 224
Release date: 3/2/2025

Model Overview

This model pairs the MEXMA multilingual text encoder with the SigLIP2 image encoder to support cross-modal retrieval, and it performs particularly well on zero-shot image classification.

Model Features

Multilingual support: Supports 80 languages, including various Asian, European, and African languages.
High-performance cross-modal retrieval: Achieves new state-of-the-art results on the Crossmodal-3600 dataset.
Zero-shot learning capability: Performs image classification tasks without task-specific fine-tuning.
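
To illustrate how zero-shot classification typically works with a CLIP-style model, here is a minimal sketch in plain Python. It assumes the image embedding and one text embedding per candidate label have already been produced by the two encoders. The toy 3-dimensional vectors, the helper names, and the temperature value are all hypothetical and are not taken from the model itself. The predicted label is simply the one whose text embedding is most similar to the image embedding.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_scores(image_emb, label_embs):
    """Cosine similarity between one image embedding and each label embedding."""
    img = l2_normalize(image_emb)
    return [sum(a * b for a, b in zip(img, l2_normalize(t))) for t in label_embs]


def softmax(scores, temperature=0.01):
    """Turn similarities into a distribution; the temperature is an assumed value."""
    scaled = [s / temperature for s in scores]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_classify(image_emb, label_embs, labels):
    """Return the best-matching label and the full probability list."""
    probs = softmax(cosine_scores(image_emb, label_embs))
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs


# Hypothetical embeddings standing in for real encoder outputs.
# Label prompts can be written in different languages.
labels = ["a photo of a cat", "ein Foto von einem Hund", "una foto de un coche"]
label_embs = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.0, 0.1, 0.9]]
image_emb = [0.8, 0.2, 0.1]

predicted, probs = zero_shot_classify(image_emb, label_embs, labels)
print(predicted)
```

No fine-tuning step appears anywhere in the sketch: adding a new class only requires encoding one more text prompt.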

Model Capabilities

Zero-shot image classification
Cross-modal retrieval
Multilingual text understanding
Image-text matching

Use Cases

Image retrieval
Multilingual image search: Retrieve relevant images using queries in different languages. Achieves 62.54% image retrieval accuracy on the Crossmodal-3600 dataset.
Text retrieval
Image-related text retrieval: Retrieve relevant text descriptions based on image content. Achieves 59.99% text retrieval accuracy on the Crossmodal-3600 dataset.
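
The retrieval use cases above can be sketched as ranking items by embedding similarity and then scoring the ranking. The example below is a rough illustration in plain Python, assuming text-query and image embeddings are already available. The 2-dimensional vectors, the function names, and the recall@k metric are assumptions chosen for illustration; the page does not state exactly how the Crossmodal-3600 figures were computed.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def similarity_matrix(query_embs, item_embs):
    """Cosine similarity of every query against every candidate item."""
    queries = [normalize(q) for q in query_embs]
    items = [normalize(i) for i in item_embs]
    return [[sum(a * b for a, b in zip(q, i)) for i in items] for q in queries]


def rank_items(sim_row):
    """Item indices sorted from most to least similar for one query."""
    return sorted(range(len(sim_row)), key=lambda j: sim_row[j], reverse=True)


def recall_at_k(sim, ground_truth, k):
    """Fraction of queries whose correct item appears in the top-k results."""
    hits = sum(
        1 for row, target in zip(sim, ground_truth) if target in rank_items(row)[:k]
    )
    return hits / len(ground_truth)


# Hypothetical embeddings: three text queries and three candidate images.
# ground_truth[i] is the index of the image that matches query i.
image_embs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query_embs = [[0.9, 0.1], [0.2, 0.8], [0.9, 0.3]]
ground_truth = [0, 1, 2]

sim = similarity_matrix(query_embs, image_embs)
print(recall_at_k(sim, ground_truth, k=1))
print(recall_at_k(sim, ground_truth, k=2))
```

Text retrieval follows the same pattern with the roles swapped: image embeddings become the queries and caption embeddings become the candidates.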