
llm-jp-clip-vit-base-patch16

Developed by llm-jp
A Japanese CLIP model trained with the OpenCLIP framework, supporting zero-shot image classification tasks
Downloads: 40
Release Time: 12/17/2024

Model Overview

This is a Japanese vision-language model that associates images with Japanese text, making it particularly well suited to zero-shot image classification. The model has 248M parameters and was trained on a dataset of 1.45 billion Japanese image-text pairs.

Model Features

Japanese-specific
A CLIP model optimized specifically for Japanese, with strong Japanese text understanding
Large-scale training data
Trained on 1.45 billion Japanese image-text pairs, covering a wide range of visual concepts
Zero-shot capability
Can classify images into new categories without any task-specific training

Model Capabilities

Zero-shot image classification
Image-text matching
Cross-modal retrieval

Use Cases

Image classification
Japanese-labeled image classification
Classify images using Japanese text labels, as sketched in the example below
Achieves 54.2% accuracy on the Japanese ImageNet classification task
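
A minimal zero-shot classification sketch in Python. It assumes the model can be loaded through open_clip's Hugging Face Hub interface, with the hub id llm-jp/llm-jp-clip-vit-base-patch16 inferred from the model name; the image path and label set are placeholders, so check the official model card for the exact loading recipe.

import torch
import open_clip
from PIL import Image

# Load model and tokenizer from the Hub; hub id is an assumption
# based on the model name, not confirmed by this page.
model, _, preprocess = open_clip.create_model_and_transforms(
    "hf-hub:llm-jp/llm-jp-clip-vit-base-patch16")
tokenizer = open_clip.get_tokenizer("hf-hub:llm-jp/llm-jp-clip-vit-base-patch16")
model.eval()

labels = ["犬", "猫", "鳥"]  # dog, cat, bird; any Japanese labels work
text = tokenizer(labels)
image = preprocess(Image.open("example.jpg")).unsqueeze(0)  # placeholder path

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # Normalize so dot products become cosine similarities
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

for label, p in zip(labels, probs[0].tolist()):
    print(f"{label}: {p:.3f}")

Because the labels are plain Japanese strings supplied at inference time, the same code classifies into entirely new categories by simply editing the list.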
Cross-modal retrieval
Image search
Retrieve relevant images using Japanese text queries, as sketched in the example below
Achieves 73.6% accuracy on the XM3600 image-to-text retrieval task
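
A minimal text-to-image retrieval sketch under the same assumptions as above (open_clip Hub loading, hub id inferred from the model name); the gallery paths and query are placeholders. It ranks a small set of candidate images by cosine similarity to a Japanese query.

import torch
import open_clip
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "hf-hub:llm-jp/llm-jp-clip-vit-base-patch16")
tokenizer = open_clip.get_tokenizer("hf-hub:llm-jp/llm-jp-clip-vit-base-patch16")
model.eval()

paths = ["img1.jpg", "img2.jpg", "img3.jpg"]  # placeholder image gallery
query = "夕焼けの海岸"  # "a beach at sunset"

with torch.no_grad():
    images = torch.stack([preprocess(Image.open(p)) for p in paths])
    image_feats = model.encode_image(images)
    text_feats = model.encode_text(tokenizer([query]))
    image_feats = image_feats / image_feats.norm(dim=-1, keepdim=True)
    text_feats = text_feats / text_feats.norm(dim=-1, keepdim=True)
    sims = (image_feats @ text_feats.T).squeeze(1)  # one score per image

# Print gallery images ranked by similarity to the query, best first
for idx in sims.argsort(descending=True).tolist():
    print(f"{paths[idx]}: {sims[idx]:.3f}")

For a real gallery, the image embeddings would be computed once and cached, since only the query side changes between searches.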