FARE2 CLIP
Vision-language model initialized with OpenAI CLIP, enhanced with unsupervised adversarial fine-tuning for improved robustness
Release Time: 2/23/2024
Model Overview
This model is a vision-language model based on the CLIP architecture. Starting from OpenAI CLIP weights, it is fine-tuned on ImageNet with unsupervised adversarial fine-tuning, specifically enhancing its robustness against adversarial attacks on image inputs.
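The core idea of the unsupervised adversarial fine-tuning mentioned above can be sketched as a two-level game: an ℓ∞-bounded perturbation is crafted to push the fine-tuned encoder's embedding of the perturbed image away from the frozen original encoder's embedding of the clean image, and the encoder is then trained to shrink that distance. The following is a minimal NumPy sketch of the inner (attack) step only, with toy linear encoders standing in for CLIP's image encoder; all names, shapes, and step sizes are illustrative, not the actual training code:

```python
import numpy as np

rng = np.random.default_rng(0)
EPS = 2 / 255  # l_inf radius used for the adversarial fine-tuning

# Toy stand-ins for the frozen original and the fine-tuned image encoders.
W_orig = rng.standard_normal((8, 16))
W_ft = W_orig + 0.01 * rng.standard_normal((8, 16))

def embed(W, x):
    return W @ x

def pgd_linf(x, steps=10, step_size=EPS / 4):
    """Find an l_inf-bounded delta maximizing the distance between the
    fine-tuned embedding of x+delta and the frozen clean embedding of x
    (the inner maximization of the fine-tuning objective)."""
    target = embed(W_orig, x)          # clean embedding, original encoder
    delta = np.zeros_like(x)
    for _ in range(steps):
        diff = embed(W_ft, x + delta) - target
        grad = W_ft.T @ (2 * diff)     # gradient of ||diff||^2 w.r.t. delta
        # Sign-gradient ascent step, projected back into the l_inf ball.
        delta = np.clip(delta + step_size * np.sign(grad), -EPS, EPS)
    return delta

x = rng.random(16)
delta = pgd_linf(x)
# The perturbation never leaves the 2/255 l_inf ball.
print(np.abs(delta).max() <= EPS + 1e-12)
# Outer step (not shown): update W_ft to minimize the same distance.
```

Because only embedding distances are compared, no labels are needed, which is why the procedure is unsupervised.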
Model Features
Adversarial Robustness
Uses unsupervised adversarial fine-tuning under the ℓ∞ threat model with radius 2/255, strengthening the model's resistance to adversarial attacks
Zero-shot Capability
Retains CLIP's zero-shot classification ability, so it can be applied to new tasks without task-specific fine-tuning
Vision-Language Alignment
Maintains CLIP's alignment of visual and language representations, supporting cross-modal tasks
Model Capabilities
Zero-shot image classification
Cross-modal retrieval
Adversarial robustness analysis
Use Cases
Computer Vision
Robust Image Classification
Reliable image classification under adversarial attack conditions
Demonstrates stronger adversarial robustness on ImageNet
Cross-modal Retrieval
Cross-modal search between images and text
Security Applications
Adversarial Attack Detection
Identifying inputs that may contain adversarial perturbations
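The cross-modal retrieval use case above relies on the same shared embedding space: embed a query in one modality and rank the other modality's gallery by cosine similarity. A toy sketch with stand-in embeddings (the gallery size, dimensionality, and the "true match" construction are all illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

def normalize(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Stand-in gallery of image embeddings; in practice these would come from
# the CLIP image encoder applied once to each image in the collection.
gallery = normalize(rng.standard_normal((100, 512)))

# Make image 42 the intended match by building the text-query embedding
# near it (a real query would come from the CLIP text encoder).
query = normalize(gallery[42] + 0.01 * rng.standard_normal(512))

scores = gallery @ query              # cosine similarity per image
ranking = np.argsort(-scores)         # best match first
print(ranking[0])                     # index of the top-ranked image
```

Text-to-image and image-to-text retrieval are symmetric: only which side supplies the query and which supplies the gallery changes.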