Fare4 Clip
Developed by chs20
A vision-language model initialized from OpenAI CLIP, with robustness enhanced through unsupervised adversarial fine-tuning
Release Time: 2/23/2024
Model Overview
This vision-language model is based on the CLIP architecture and has been made more robust through unsupervised adversarial fine-tuning on the ImageNet dataset, using the L-infinity norm with a radius of 4/255.
Model Features
Unsupervised Adversarial Fine-tuning
Adversarial training on ImageNet using the L-infinity norm with a radius of 4/255 to enhance model robustness
Based on CLIP Architecture
Inherits CLIP's powerful vision-language alignment capabilities
Enhanced Robustness
Specifically optimized for adversarial attack scenarios to improve model stability
Model Capabilities
Zero-shot Image Classification
Image-Text Matching
Cross-modal Retrieval
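Zero-shot classification with a CLIP-style model reduces to cosine similarity between one image embedding and several text-prompt embeddings. The sketch below shows only that scoring step on placeholder vectors; the labels, dimensions, and embeddings are invented for illustration, and loading the actual checkpoint would use whatever CLIP tooling it was published with.

```python
import numpy as np

def zero_shot_classify(image_emb, text_embs, labels):
    """Return the label whose text embedding has the highest cosine
    similarity with the image embedding (CLIP-style zero-shot)."""
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    sims = txt @ img                       # cosine similarity per label
    return labels[int(np.argmax(sims))], sims

# Toy vectors standing in for CLIP encoder outputs (assumed 512-dim).
rng = np.random.default_rng(1)
labels = ["a photo of a cat", "a photo of a dog"]
text_embs = rng.normal(size=(2, 512))
image_emb = text_embs[0] + 0.1 * rng.normal(size=512)  # near the first prompt
best, sims = zero_shot_classify(image_emb, text_embs, labels)
```

The same similarity matrix, computed both row-wise and column-wise, also covers the image-text matching and cross-modal retrieval capabilities listed above.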
Use Cases
Computer Vision
Robust Image Classification
Reliable image classification in adversarial attack environments
Exhibits stronger adversarial robustness compared to standard CLIP models
Cross-modal Retrieval
Bidirectional retrieval between images and text that remains reliable under adversarial conditions
© 2025 AIbase