vit_so400m_patch16_siglip_gap_384.v2_webli Open-source Model

Developed by timm

A SigLIP 2 ViT image encoder that uses global average pooling (GAP) in place of the attention pooling head, intended for image feature extraction.

Image Classification

Transformers

License: Apache-2.0
Tags: Multimodal Visual Encoding, Global Average Pooling, High Semantic Understanding

Downloads 19

Release Date: 2/21/2025

Model Overview

This model is a SigLIP 2 ViT image encoder packaged for use with timm and intended primarily for image feature extraction. It was trained on the WebLI dataset and uses global average pooling (GAP) in place of an attention pooling head.
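A minimal sketch of feature extraction with timm, assuming timm, torch, and Pillow are installed and the pretrained weights can be downloaded. The helper name `extract_features` and the `image_path` argument are illustrative, not part of the model card. The timm calls shown (`create_model`, `resolve_model_data_config`, `create_transform`) are the standard timm entry points.

```python
MODEL_NAME = "vit_so400m_patch16_siglip_gap_384.v2_webli"


def extract_features(image_path, model_name=MODEL_NAME):
    """Return one pooled feature vector for the image at image_path.

    Hypothetical helper for illustration; weights are downloaded on first use.
    """
    # Local imports: defining this function does not require timm to be installed.
    import timm
    import torch
    from PIL import Image

    # num_classes=0 asks timm for pooled features with no classifier layer.
    model = timm.create_model(model_name, pretrained=True, num_classes=0)
    model.eval()

    # Build the preprocessing (resize, normalization) that matches the weights.
    data_config = timm.data.resolve_model_data_config(model)
    transform = timm.data.create_transform(**data_config, is_training=False)

    image = Image.open(image_path).convert("RGB")
    with torch.no_grad():
        # Add a batch dimension; result has shape (1, model.num_features).
        features = model(transform(image).unsqueeze(0))
    return features[0]
```

Calling `extract_features("photo.jpg")` (the file name is a placeholder) should return a 1-D tensor whose length equals `model.num_features`.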

Model Features

SigLIP 2 Architecture

Built on SigLIP 2, which its paper describes as improving semantic understanding, localization, and dense features over the original SigLIP

Global Average Pooling

Replaces the attention pooling head with global average pooling (GAP): the final patch token embeddings are averaged into a single feature vector, which simplifies the model structure
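To make the pooling step concrete, here is a small pure-Python sketch of global average pooling over token vectors. It illustrates the idea only and is not the timm implementation. The function name and the toy 2-dimensional tokens are assumptions chosen for readability.

```python
def global_average_pool(tokens):
    """Average a list of equal-length token vectors into one vector."""
    if not tokens:
        raise ValueError("need at least one token")
    count = len(tokens)
    dim = len(tokens[0])
    # Mean over the token axis, computed independently for each feature dimension.
    return [sum(tok[i] for tok in tokens) / count for i in range(dim)]


# Three toy "patch tokens" with two features each.
patch_tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
pooled = global_average_pool(patch_tokens)
print(pooled)  # [3.0, 4.0]
```

Unlike an attention pooling head, this step has no learned parameters: every patch contributes equally to the pooled vector.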

Large-scale Pretraining

Pretrained on the large-scale WebLI dataset

Model Capabilities

Image Feature Extraction

Visual Semantic Understanding

Dense Feature Extraction

Use Cases

Computer Vision

Image Retrieval

Compares extracted image features to retrieve similar images
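As an illustration of feature-based retrieval, the sketch below ranks a small gallery by cosine similarity to a query vector. The 3-dimensional vectors and image names are made-up stand-ins for real encoder outputs, and cosine similarity is one common choice of metric, assumed here rather than prescribed by the model card.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Treat a zero vector as dissimilar to everything.
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, gallery):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in gallery.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy feature vectors standing in for encoder outputs (hypothetical data).
gallery = {
    "cat_1": [0.9, 0.1, 0.0],
    "dog_1": [0.1, 0.9, 0.0],
    "cat_2": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]

ranking = rank_by_similarity(query, gallery)
print([name for name, _ in ranking])  # ['cat_1', 'cat_2', 'dog_1']
```

For large galleries, the same ranking is usually computed with a batched matrix product or an approximate nearest-neighbor index rather than a Python loop.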

Visual Localization

Supplies image features that downstream models can use to identify specific regions and objects in images

Multimodal Applications

Vision-Language Tasks

Serves as a visual encoder for joint vision-language tasks

Property	Details
Dataset	webli
Papers	SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features: https://arxiv.org/abs/2502.14786
	Sigmoid Loss for Language Image Pre-Training: https://arxiv.org/abs/2303.15343
