NLLB-CLIP Base OC
Developed by visheratin
NLLB-CLIP is a multilingual vision-language model that combines the NLLB text encoder with the CLIP image encoder, supporting 201 languages.
Downloads: 371
Release date: 10/7/2023
Model Overview
This model integrates the text encoding capabilities of NLLB with the image encoding capabilities of CLIP, extending vision-language understanding across many languages and performing particularly well on low-resource languages.
Model Features
Multilingual support
Supports the 201 languages of Flores-200, including many low-resource languages
Cross-modal understanding
Combines text and image encoding capabilities to achieve vision-language alignment
Low-resource language optimization
Achieves state-of-the-art results on low-resource languages
Model Capabilities
Multilingual image classification
Cross-modal retrieval
Zero-shot learning
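The capabilities above all rest on one mechanism: the text and image encoders map their inputs into a shared embedding space, and cosine similarity between embeddings drives classification and retrieval. A minimal numpy sketch of zero-shot classification with dummy embeddings (the real model would produce these via its NLLB text encoder and CLIP image encoder; the dimension 512, the function names, and the temperature value are illustrative assumptions):

```python
import numpy as np

def l2_normalize(x, axis=-1):
    # Scale vectors to unit length so dot products equal cosine similarities.
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def zero_shot_classify(image_emb, label_embs, temperature=100.0):
    # Cosine similarity between the image and each label prompt,
    # turned into a probability distribution with a softmax.
    image_emb = l2_normalize(image_emb)
    label_embs = l2_normalize(label_embs)
    logits = temperature * label_embs @ image_emb
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

# Dummy embeddings standing in for encoder outputs (dim 512 is illustrative).
rng = np.random.default_rng(0)
image_emb = rng.normal(size=512)
label_embs = rng.normal(size=(3, 512))
label_embs[1] = image_emb + 0.1 * rng.normal(size=512)  # make label 1 the close match

probs = zero_shot_classify(image_emb, label_embs)
print(probs.argmax())  # label 1 wins because its embedding is closest
```

Because the label prompts can be written in any of the 201 supported languages, the same similarity computation gives multilingual image classification for free.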
Use Cases
Multilingual content understanding
Multilingual image captioning
Generates descriptive labels for images in multiple languages
Performs strongly on the Crossmodal-3600 benchmark
Cross-language image search
Retrieves relevant images using queries in different languages
Low-resource language applications
Low-resource language image classification
Classifies images in low-resource language environments
Achieves state-of-the-art performance on low-resource languages