
SigLIP Base Patch16 384

Developed by Google
SigLIP is a multimodal model pre-trained on the WebLI dataset. It employs an improved sigmoid loss function and is well suited to zero-shot image classification and image-text retrieval.
Downloads 2,570
Release Time: 1/8/2024

Model Overview

SigLIP is a CLIP-style multimodal model with an improved loss function: its sigmoid loss operates on individual image-text pairs and requires no normalization over global pairwise similarities. It is well suited to tasks such as zero-shot image classification and image-text retrieval.

Model Features

Improved Loss Function
Uses a sigmoid loss that operates on individual image-text pairs and requires no normalization over global pairwise similarities, allowing the model to perform well at both large and small batch sizes.
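As a rough sketch of this idea (using made-up toy embeddings and placeholder values for the model's learned scale and bias scalars), the pairwise sigmoid objective scores every image-text pair in a batch independently: matching pairs get label +1, all others -1, with no batch-wide softmax normalization.

```python
import numpy as np

def sigmoid_loss(img_emb, txt_emb, scale=10.0, bias=-10.0):
    """Sketch of a SigLIP-style pairwise sigmoid loss.

    Each image-text pair is scored independently: matching pairs
    (the diagonal) get label +1, all other pairs -1. No softmax
    normalization over the batch is required.
    """
    logits = scale * img_emb @ txt_emb.T + bias    # (B, B) pair logits
    labels = 2.0 * np.eye(len(img_emb)) - 1.0      # +1 diagonal, -1 off-diagonal
    # negative log-sigmoid of label * logit, averaged per image
    return np.sum(np.log1p(np.exp(-labels * logits))) / len(img_emb)

# Toy L2-normalized embeddings standing in for the encoders' outputs.
rng = np.random.default_rng(0)
img = rng.normal(size=(4, 8))
txt = rng.normal(size=(4, 8))
img /= np.linalg.norm(img, axis=1, keepdims=True)
txt /= np.linalg.norm(txt, axis=1, keepdims=True)

loss = sigmoid_loss(img, txt)
print(loss)
```

Because every pair is scored on its own, the loss does not change character with batch size, which is the property credited for good behavior at both small and large batches.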
Efficient Training
Training can be completed in just three days on 16 TPU-v4 chips.
High-Resolution Support
Supports image inputs with a resolution of 384x384.

Model Capabilities

Zero-shot image classification
Image-text retrieval

Use Cases

Image Classification
Animal Recognition
Identify the type of animal in an image, such as a cat or dog.
Accurately identifies animal species from images.
Image-Text Retrieval
Image Search
Search for relevant images based on text descriptions.
Efficiently retrieves images that match a text description.
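A minimal retrieval sketch, assuming image and text embeddings have already been produced by the respective encoders (the embeddings and the scale/bias values below are placeholders, not the model's learned ones): each gallery image gets an independent sigmoid match probability against the query, and results are ranked by that score.

```python
import numpy as np

# Hypothetical precomputed, L2-normalized embeddings: a 5-image gallery
# and one text query (in practice these come from SigLIP's encoders).
rng = np.random.default_rng(1)
gallery = rng.normal(size=(5, 8))
gallery /= np.linalg.norm(gallery, axis=1, keepdims=True)
query = rng.normal(size=8)
query /= np.linalg.norm(query)

scale, bias = 10.0, -10.0               # placeholders for the learned scalars
logits = scale * gallery @ query + bias
probs = 1.0 / (1.0 + np.exp(-logits))   # independent match probability per image

ranking = np.argsort(-probs)            # best match first
print(ranking)
print(probs[ranking])
```

Since each score is an independent probability rather than a softmax over the gallery, scores stay comparable when images are added to or removed from the index.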