
llm-jp-clip-vit-large-patch14

Developed by llm-jp
A Japanese CLIP model built on the OpenCLIP framework and trained on a dataset of 1.45 billion Japanese image-text pairs. It supports zero-shot image classification and image-text retrieval.
Downloads 254
Release Time: 12/27/2024

Model Overview

This is a Japanese vision-language model that maps images and Japanese text into a shared embedding space, enabling zero-shot image classification and cross-modal retrieval.

Model Features

Large-scale Japanese training data
Trained on a dataset of 1.45 billion Japanese image-text pairs produced by high-quality machine translation
High-performance vision-language understanding
Performs strongly across multiple benchmarks, particularly on tasks involving Japanese culture
Zero-shot classification ability
Classifies images without any task-specific fine-tuning (see the sketch after this list)
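
Below is a minimal zero-shot classification sketch. It assumes the checkpoint is published in an OpenCLIP-compatible format on the Hugging Face Hub under llm-jp/llm-jp-clip-vit-large-patch14 (repo id inferred from the model name) and that "cat.jpg" is a placeholder input image; verify the exact loading procedure against the official model card.

```python
import torch
import open_clip
from PIL import Image

# Assumed hf-hub repo id; check the official model card before relying on it.
repo = "hf-hub:llm-jp/llm-jp-clip-vit-large-patch14"
model, _, preprocess = open_clip.create_model_and_transforms(repo)
tokenizer = open_clip.get_tokenizer(repo)
model.eval()

# Japanese class prompts; "cat.jpg" is a placeholder file.
labels = ["猫", "犬", "鳥"]  # cat, dog, bird
image = preprocess(Image.open("cat.jpg")).unsqueeze(0)
text = tokenizer(labels)

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # Normalize so the dot product below is cosine similarity.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

for label, p in zip(labels, probs[0].tolist()):
    print(f"{label}: {p:.3f}")
```

The predicted class is simply the label whose text embedding is most similar to the image embedding, which is what makes the model usable without fine-tuning.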

Model Capabilities

Zero-shot image classification
Image-text similarity calculation
Cross-modal retrieval (see the retrieval sketch after this list)
Image semantic understanding
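
The same encoders support text-to-image retrieval by ranking a gallery of images against a Japanese query. The sketch below makes the same hf-hub assumption as above; the gallery file names are placeholders, and in practice the gallery embeddings would be precomputed and indexed.

```python
import torch
import open_clip
from PIL import Image

# Same assumed hf-hub repo id as in the classification sketch.
repo = "hf-hub:llm-jp/llm-jp-clip-vit-large-patch14"
model, _, preprocess = open_clip.create_model_and_transforms(repo)
tokenizer = open_clip.get_tokenizer(repo)
model.eval()

gallery_paths = ["img_001.jpg", "img_002.jpg", "img_003.jpg"]  # placeholder files
query = "赤い傘をさした女性"  # "a woman holding a red umbrella"

with torch.no_grad():
    images = torch.stack([preprocess(Image.open(p)) for p in gallery_paths])
    image_features = model.encode_image(images)
    text_features = model.encode_text(tokenizer([query]))
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    scores = (image_features @ text_features.T).squeeze(1)

# Rank gallery images by cosine similarity to the query text.
for idx in scores.argsort(descending=True).tolist():
    print(f"{gallery_paths[idx]}  score={scores[idx].item():.3f}")
```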

Use Cases

Content moderation
Inappropriate content detection
Detect inappropriate image content by scoring images against text descriptions
E-commerce
Product search
Find relevant product images through natural language descriptions
Media analysis
Image annotation
Automatically generate Japanese description tags for images