🚀 Coin-CLIP 🪙: Enhancing Coin Image Retrieval with CLIP
Coin-CLIP is a specialized model for coin image retrieval. It combines the power of a Vision Transformer (ViT) with CLIP's multimodal learning, fine-tuned on a large coin image dataset to improve feature extraction and achieve more accurate image-based search.
✨ Features
- State-of-the-art coin image retrieval;
- Enhanced feature extraction for numismatic images;
- Seamless integration with CLIP's multimodal learning.
📚 Documentation
Model Details
This model (Coin-CLIP) is built upon OpenAI's [CLIP](https://huggingface.co/openai/clip-vit-base-patch32) (ViT-B/32) model and fine-tuned on a dataset of more than 340,000 coin images using contrastive learning. By combining the Vision Transformer (ViT) with CLIP's multimodal learning capabilities and tailoring them to the numismatic domain, Coin-CLIP significantly improves feature extraction for coin images, enabling more accurate image-based search.
Comparison: Coin-CLIP vs. CLIP
Example 1 (Left: Coin-CLIP; Right: CLIP)

Example 2 (Left: Coin-CLIP; Right: CLIP)

More examples can be found at [breezedeus/Coin-CLIP](https://github.com/breezedeus/Coin-CLIP).
Usage and Limitations
- Usage: This model is primarily intended for extracting representation vectors from coin images, enabling efficient and precise image-based search in a coin image database (see the sketch below).
- Limitations: Because the model is trained specifically on coin images, it may not perform well on non-coin images.
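For illustration, here is a minimal sketch of such a database search built directly on the `transformers` API. The file paths are placeholders, and the brute-force top-1 lookup stands in for a real index:

```python
import torch
import torch.nn.functional as F
from PIL import Image
from transformers import CLIPProcessor, CLIPModel

model = CLIPModel.from_pretrained("breezedeus/coin-clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("breezedeus/coin-clip-vit-base-patch32")

def embed(paths):
    # Encode a batch of images into L2-normalized feature vectors.
    images = [Image.open(p).convert("RGB") for p in paths]
    inputs = processor(images=images, return_tensors="pt")
    with torch.no_grad():
        feats = model.get_image_features(**inputs)
    return F.normalize(feats, dim=1)

# Placeholder paths for a small coin database and a query image.
db_paths = ["coins/a.jpg", "coins/b.jpg", "coins/c.jpg"]
db_feats = embed(db_paths)                # (N, D) database matrix
query_feats = embed(["coins/query.jpg"])  # (1, D) query vector

# On unit vectors, cosine similarity is a plain dot product.
scores = query_feats @ db_feats.T         # (1, N) similarity scores
best = scores.argmax(dim=1).item()
print("Closest match:", db_paths[best])
```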
Base Model
The base model is [openai/clip-vit-base-patch32](https://huggingface.co/openai/clip-vit-base-patch32).
Training Data
The model was trained on a specialized coin image dataset covering coins from a variety of currencies.
Training Process
The model was fine-tuned from the pretrained OpenAI CLIP (ViT-B/32) weights on the coin image dataset using contrastive learning.
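The exact loss and hyperparameters are not published here. As a rough illustration, a CLIP-style symmetric contrastive (InfoNCE) objective over a batch of matched embedding pairs can be sketched as follows; this is a generic sketch, not the actual Coin-CLIP training code, and the temperature is a commonly used default rather than a known setting:

```python
import torch
import torch.nn.functional as F

def clip_contrastive_loss(emb_a, emb_b, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of matched embedding pairs."""
    emb_a = F.normalize(emb_a, dim=1)
    emb_b = F.normalize(emb_b, dim=1)
    logits = emb_a @ emb_b.T / temperature  # (B, B) similarity matrix
    targets = torch.arange(logits.size(0))  # matched pairs lie on the diagonal
    # Each row must pick out its own column, and vice versa.
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.T, targets)) / 2
```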
Performance
This model demonstrates excellent performance on coin image retrieval tasks; see the qualitative comparison with the base CLIP model above.
Feedback
For questions or comments about the model, feel free to contact the author, [Breezedeus](https://www.breezedeus.com/join-group).
💻 Usage Examples
Transformers
from PIL import Image
import torch
import torch.nn.functional as F
from transformers import CLIPProcessor, CLIPModel

# Load the fine-tuned Coin-CLIP weights and the matching processor.
model = CLIPModel.from_pretrained("breezedeus/coin-clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("breezedeus/coin-clip-vit-base-patch32")

image_fp = "path/to/coin_image.jpg"
image = Image.open(image_fp).convert("RGB")

# Preprocess the image and extract its feature vector.
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    img_features = model.get_image_features(**inputs)

# L2-normalize the feature vector.
img_features = F.normalize(img_features, dim=1)
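Because the returned features are L2-normalized, the cosine similarity between two coin images reduces to a dot product of their vectors, which is what makes the database search sketched earlier a single matrix multiplication.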
Tool
To further simplify the use of the Coin-CLIP model, we provide a simple Python library, [breezedeus/Coin-CLIP](https://github.com/breezedeus/Coin-CLIP), for quickly building a coin image retrieval engine.
Install
pip install coin_clip
Extract Feature Vectors
from coin_clip import CoinClip

# Load the Coin-CLIP model.
model = CoinClip(model_name='breezedeus/coin-clip-vit-base-patch32')
images = ['examples/10_back.jpg', 'examples/16_back.jpg']

# Extract one feature vector per successfully read image.
img_feats, success_ids = model.get_image_features(images)
print(img_feats.shape)  # (2, 512)
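Here, `success_ids` lists the indices of the inputs that were processed successfully. Assuming `img_feats` comes back as a NumPy array with one row per image (as the shape printout suggests), a quick similarity check between the two coins looks like this; the renormalization is defensive, in case the vectors are not already unit length:

```python
import numpy as np

# Defensive L2 normalization; cosine similarity is then a dot product.
feats = img_feats / np.linalg.norm(img_feats, axis=1, keepdims=True)
print(float(feats[0] @ feats[1]))  # similarity between the two coin images
```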
More tools can be found at [breezedeus/Coin-CLIP](https://github.com/breezedeus/Coin-CLIP).
📄 License
The model is licensed under the Apache 2.0 license.