FARE4-ViT-B-32-laion2B-s34B-b79K
Developed by chs20
A robust perceptual-metric model based on CLIP, adversarially fine-tuned for improved performance under attack.
Downloads: 45
Release date: 8/14/2024
Model Overview
This vision-language model is built on the CLIP architecture and adversarially fine-tuned on ImageNet using the FARE method, which strengthens robustness against adversarial attacks. It is primarily used for perceptual similarity tasks and zero-shot image classification.
Model Features
Adversarial Robustness
Uses the FARE method for adversarial fine-tuning, maintaining high performance under L-infinity and L2 norm attacks
Perceptual Similarity Metric
Performs strongly on the NIGHTS dataset, accurately assessing perceptual similarity between images
Zero-shot Capability
Built on the CLIP architecture, retaining strong zero-shot image classification ability
Model Capabilities
Zero-shot image classification
Image perceptual similarity metric
Adversarial robustness evaluation
Use Cases
Computer Vision
Image Classification
Classify images in zero-shot settings
Maintains high accuracy under adversarial attacks
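CLIP-style zero-shot classification compares an image embedding against text embeddings of class prompts and picks the most similar class. A minimal sketch of that decision rule, using toy NumPy vectors in place of real CLIP features (the embeddings and prompt labels here are illustrative, not taken from this model):

```python
import numpy as np

def zero_shot_classify(image_emb, text_embs):
    """Return the index of the class prompt most similar to the image.

    CLIP-style rule: L2-normalize both sides so the dot product
    equals cosine similarity, then take the argmax over prompts.
    """
    image_emb = image_emb / np.linalg.norm(image_emb)
    text_embs = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    sims = text_embs @ image_emb  # one cosine similarity per class prompt
    return int(np.argmax(sims))

# Toy 4-dim embeddings standing in for real CLIP features.
image = np.array([0.9, 0.1, 0.0, 0.1])
prompts = np.array([
    [0.0, 1.0, 0.0, 0.0],  # e.g. "a photo of a dog"
    [1.0, 0.0, 0.0, 0.2],  # e.g. "a photo of a cat" (closest to the image)
])
print(zero_shot_classify(image, prompts))  # → 1
```

In practice the embeddings would come from this model's image and text encoders; the decision rule itself is unchanged.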
Image Similarity Assessment
Evaluate perceptual similarity between two images
Achieves a clean (non-attacked) score of 91.1 on the NIGHTS dataset
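A CLIP-based perceptual metric scores two images by embedding both with the vision encoder and comparing the embeddings. A minimal sketch using cosine distance over precomputed stand-in embeddings (this illustrates the distance computation only, not the NIGHTS two-alternative evaluation protocol):

```python
import numpy as np

def perceptual_distance(emb_a, emb_b):
    """Cosine distance between two image embeddings.

    0 means the embeddings point in the same direction
    (perceptually similar); larger values mean less similar.
    """
    a = emb_a / np.linalg.norm(emb_a)
    b = emb_b / np.linalg.norm(emb_b)
    return 1.0 - float(a @ b)

# Toy 3-dim embeddings standing in for real CLIP image features.
ref = np.array([1.0, 0.0, 0.0])
near = np.array([0.9, 0.1, 0.0])  # perceptually close variant
far = np.array([0.0, 1.0, 0.0])   # unrelated image
assert perceptual_distance(ref, near) < perceptual_distance(ref, far)
```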
Security Research
Adversarial Robustness Testing
Evaluate model stability under adversarial attacks
Maintains a score of 71.8 under L-infinity attacks (eps = 4/255)
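An L-infinity attack at eps = 4/255 means every pixel may move by at most 4/255 from its original value. A minimal sketch of the simplest such attack, one-step FGSM, showing how the perturbation budget is enforced (toy data and a hypothetical gradient; the actual evaluation likely uses stronger multi-step attacks):

```python
import numpy as np

def fgsm_linf(image, grad, eps=4 / 255):
    """One-step L-infinity attack (FGSM): shift each pixel by +/- eps
    along the sign of the loss gradient, then clip to the valid [0, 1]
    range. The result satisfies ||adv - image||_inf <= eps.
    """
    adv = image + eps * np.sign(grad)
    return np.clip(adv, 0.0, 1.0)

rng = np.random.default_rng(0)
image = rng.uniform(0.2, 0.8, size=(4, 4))  # toy image in [0, 1]
grad = rng.normal(size=(4, 4))              # stand-in loss gradient
adv = fgsm_linf(image, grad)
# The perturbation never exceeds the eps budget.
assert np.max(np.abs(adv - image)) <= 4 / 255 + 1e-12
```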
© 2025 AIbase