
TeCoA2 CLIP

Developed by chs20
A vision-language model initialized from OpenAI CLIP and adversarially fine-tuned on ImageNet for improved robustness to adversarial examples
Downloads: 53
Release Time: 2/23/2024

Model Overview

This model is a vision-language model based on the CLIP architecture, enhanced with adversarial robustness through fine-tuning, and suitable for tasks such as zero-shot image classification.
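CLIP-style zero-shot classification embeds the image and a set of candidate text prompts into a shared space, then picks the prompt with the highest cosine similarity. A minimal sketch of that scoring logic, using random placeholder embeddings in place of real CLIP encoder outputs:

```python
import numpy as np

def zero_shot_classify(image_emb, text_embs, labels):
    """Return the label whose text embedding is most similar to the image."""
    # L2-normalize so dot products become cosine similarities
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    sims = txt @ img                       # one similarity score per label
    return labels[int(np.argmax(sims))]

# Toy embeddings standing in for real CLIP encoder outputs
rng = np.random.default_rng(0)
text_embs = rng.normal(size=(3, 8))
labels = ["cat", "dog", "car"]
image_emb = text_embs[1] + 0.1 * rng.normal(size=8)  # close to "dog"
print(zero_shot_classify(image_emb, text_embs, labels))  # -> dog
```

With the real model, `image_emb` and `text_embs` would come from the CLIP image and text encoders; the ranking step is unchanged.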

Model Features

Adversarial Robustness
Supervised adversarial fine-tuning on ImageNet under the L∞ norm with radius 2/255, improving the model's robustness against adversarial examples
Zero-shot Capability
Retains CLIP's zero-shot learning ability, applicable to various vision tasks without task-specific fine-tuning
Vision-Language Alignment
Maintains CLIP's original vision-language alignment properties, capable of understanding semantic relationships between images and text
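The L∞ constraint mentioned above means every pixel of an adversarial perturbation must stay within ±2/255 of the original image. A sketch of the projection step used in such attacks and adversarial training, with a toy gradient in place of a real loss gradient (assuming pixel values in [0, 1]):

```python
import numpy as np

EPS = 2 / 255  # L-infinity radius used for the adversarial fine-tuning

def pgd_step(x_orig, x_adv, grad, step_size=1 / 255):
    """One projected-gradient step inside an L-infinity ball of radius EPS."""
    x_adv = x_adv + step_size * np.sign(grad)            # ascend the loss
    x_adv = np.clip(x_adv, x_orig - EPS, x_orig + EPS)   # project into the ball
    return np.clip(x_adv, 0.0, 1.0)                      # keep valid pixel range

rng = np.random.default_rng(0)
x = rng.uniform(0.2, 0.8, size=(4, 4))        # toy "image"
adv = pgd_step(x, x.copy(), rng.normal(size=(4, 4)))
print(float(np.max(np.abs(adv - x))))         # never exceeds 2/255
```

Repeating this step with the true loss gradient yields the PGD adversarial examples the model is trained to withstand.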

Model Capabilities

Zero-shot Image Classification
Cross-modal Retrieval
Adversarial Example Recognition
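Cross-modal retrieval ranks a gallery of image embeddings by similarity to a text query embedding (or vice versa). A small sketch with placeholder embeddings standing in for real encoder outputs:

```python
import numpy as np

def retrieve(query_emb, gallery_embs, top_k=2):
    """Return gallery indices ranked by cosine similarity to the query."""
    q = query_emb / np.linalg.norm(query_emb)
    g = gallery_embs / np.linalg.norm(gallery_embs, axis=1, keepdims=True)
    order = np.argsort(-(g @ q))           # descending similarity
    return order[:top_k].tolist()

rng = np.random.default_rng(1)
gallery = rng.normal(size=(5, 8))               # toy image embeddings
query = gallery[3] + 0.05 * rng.normal(size=8)  # text query near item 3
print(retrieve(query, gallery))                 # index 3 ranks first
```

The same ranking works in either direction, since CLIP places images and text in the same embedding space.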

Use Cases

Computer Vision
Robust Image Classification
Accurate image classification in the presence of adversarial perturbations
More stable performance on adversarial examples compared to standard CLIP models
Cross-modal Search
Retrieve relevant images based on text descriptions or generate descriptive texts from images
Security Applications
Adversarial Example Detection
Identify potentially adversarially modified image inputs