🚀 Coin-CLIP 🪙: Enhancing Coin Image Retrieval with CLIP
Coin-CLIP is a specialized model for coin image retrieval. It combines the power of a Vision Transformer (ViT) with CLIP's multimodal learning, fine-tuned on a large coin image dataset to improve feature extraction and achieve more accurate image-based search.
✨ Features
- State-of-the-art coin image retrieval;
- Enhanced feature extraction for numismatic images;
- Seamless integration with CLIP's multimodal learning.
📚 Documentation
Model Details
This model (Coin-CLIP) is built upon OpenAI's [CLIP](https://huggingface.co/openai/clip-vit-base-patch32) (ViT-B/32) model and fine-tuned on a dataset of more than 340,000 coin images using contrastive learning. By combining the Vision Transformer (ViT) with CLIP's multimodal learning capabilities and tailoring them to the numismatic domain, Coin-CLIP significantly improves feature extraction for coin images, enabling more accurate image-based search.
Comparison: Coin-CLIP vs. CLIP
Example 1 (Left: Coin-CLIP; Right: CLIP)

Example 2 (Left: Coin-CLIP; Right: CLIP)

More examples can be found at [breezedeus/Coin-CLIP](https://github.com/breezedeus/Coin-CLIP).
Usage and Limitations
- Usage: This model is primarily intended for extracting representation vectors from coin images, enabling efficient and precise image-based search in a coin image database (see the sketch below).
- Limitations: Because the model is trained specifically on coin images, it may not perform well on non-coin images.
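For illustration, here is a minimal sketch of such a database search built directly on the `transformers` API. The file paths are placeholders, and the brute-force top-1 lookup stands in for a real index:

```python
import torch
import torch.nn.functional as F
from PIL import Image
from transformers import CLIPProcessor, CLIPModel

model = CLIPModel.from_pretrained("breezedeus/coin-clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("breezedeus/coin-clip-vit-base-patch32")

def embed(paths):
    # Encode a batch of images into L2-normalized feature vectors.
    images = [Image.open(p).convert("RGB") for p in paths]
    inputs = processor(images=images, return_tensors="pt")
    with torch.no_grad():
        feats = model.get_image_features(**inputs)
    return F.normalize(feats, dim=1)

# Placeholder paths for a small coin database and a query image.
db_paths = ["coins/a.jpg", "coins/b.jpg", "coins/c.jpg"]
db_feats = embed(db_paths)                # (N, D) database matrix
query_feats = embed(["coins/query.jpg"])  # (1, D) query vector

# On unit vectors, cosine similarity is a plain dot product.
scores = query_feats @ db_feats.T         # (1, N) similarity scores
best = scores.argmax(dim=1).item()
print("Closest match:", db_paths[best])
```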
Base Model
The base model is [openai/clip-vit-base-patch32](https://huggingface.co/openai/clip-vit-base-patch32).
Training Data
The model was trained on a specialized coin image dataset covering coins from a variety of currencies.
Training Process
The model was fine-tuned from the pretrained OpenAI CLIP (ViT-B/32) weights on the coin image dataset using contrastive learning.
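The exact loss and hyperparameters are not published here. As a rough illustration, a CLIP-style symmetric contrastive (InfoNCE) objective over a batch of matched embedding pairs can be sketched as follows; this is a generic sketch, not the actual Coin-CLIP training code, and the temperature is a commonly used default rather than a known setting:

```python
import torch
import torch.nn.functional as F

def clip_contrastive_loss(emb_a, emb_b, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of matched embedding pairs."""
    emb_a = F.normalize(emb_a, dim=1)
    emb_b = F.normalize(emb_b, dim=1)
    logits = emb_a @ emb_b.T / temperature  # (B, B) similarity matrix
    targets = torch.arange(logits.size(0))  # matched pairs lie on the diagonal
    # Each row must pick out its own column, and vice versa.
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.T, targets)) / 2
```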
Performance
This model demonstrates excellent performance on coin image retrieval tasks; see the qualitative comparison with the base CLIP model above.
Feedback
For questions or comments about the model, feel free to contact the author, [Breezedeus](https://www.breezedeus.com/join-group).
💻 Usage Examples
Transformers
from PIL import Image
import torch
import torch.nn.functional as F
from transformers import CLIPProcessor, CLIPModel

# Load the fine-tuned Coin-CLIP weights and the matching processor.
model = CLIPModel.from_pretrained("breezedeus/coin-clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("breezedeus/coin-clip-vit-base-patch32")

image_fp = "path/to/coin_image.jpg"
image = Image.open(image_fp).convert("RGB")

# Preprocess the image and extract its feature vector.
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    img_features = model.get_image_features(**inputs)

# L2-normalize the feature vector.
img_features = F.normalize(img_features, dim=1)
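Because the returned features are L2-normalized, the cosine similarity between two coin images reduces to a dot product of their vectors, which is what makes the database search sketched earlier a single matrix multiplication.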
Tool
To further simplify the use of the Coin-CLIP model, we provide a simple Python library, [breezedeus/Coin-CLIP](https://github.com/breezedeus/Coin-CLIP), for quickly building a coin image retrieval engine.
Install
pip install coin_clip
Extract Feature Vectors
from coin_clip import CoinClip

# Load the Coin-CLIP model.
model = CoinClip(model_name='breezedeus/coin-clip-vit-base-patch32')
images = ['examples/10_back.jpg', 'examples/16_back.jpg']

# Extract one feature vector per successfully read image.
img_feats, success_ids = model.get_image_features(images)
print(img_feats.shape)  # (2, 512)
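Here, `success_ids` lists the indices of the inputs that were processed successfully. Assuming `img_feats` comes back as a NumPy array with one row per image (as the shape printout suggests), a quick similarity check between the two coins looks like this; the renormalization is defensive, in case the vectors are not already unit length:

```python
import numpy as np

# Defensive L2 normalization; cosine similarity is then a dot product.
feats = img_feats / np.linalg.norm(img_feats, axis=1, keepdims=True)
print(float(feats[0] @ feats[1]))  # similarity between the two coin images
```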
More tools can be found at [breezedeus/Coin-CLIP](https://github.com/breezedeus/Coin-CLIP).
📄 License
The model is licensed under the Apache 2.0 license.