
NLLB-CLIP Large OC

Developed by visheratin
NLLB-CLIP is a multilingual vision-language model combining the NLLB model's text encoder with CLIP's image encoder, supporting 201 languages.
Downloads: 28
Release date: 10/7/2023

Model Overview

This model pairs NLLB's text encoder with CLIP's image encoder, extending CLIP-style vision-language modeling to the 201 languages of Flores-200 and performing especially well on low-resource languages.

Model Features

Multilingual support
Supports 201 languages from Flores-200, with particularly outstanding performance on low-resource languages.
Cross-modal capability
Combines NLLB's text encoder with CLIP's image encoder to enable zero-shot image classification (see the sketch after this feature list).
High performance
Achieves state-of-the-art results on the Crossmodal-3600 benchmark.
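
The zero-shot classification workflow can be illustrated with a short script. This is a minimal sketch that assumes the checkpoint is published in an OpenCLIP-compatible format and can be loaded through open_clip's hf-hub interface (as the "oc" suffix suggests); the image path and the label strings are placeholders, not part of the official model card.

```python
import torch
import open_clip
from PIL import Image

# Assumption: the checkpoint can be loaded via open_clip's hf-hub interface.
model, _, preprocess = open_clip.create_model_and_transforms(
    "hf-hub:visheratin/nllb-clip-large-oc"
)
tokenizer = open_clip.get_tokenizer("hf-hub:visheratin/nllb-clip-large-oc")
model.eval()

# Placeholder image path; candidate labels in Spanish and German to show multilinguality.
image = preprocess(Image.open("example.jpg")).unsqueeze(0)
labels = [
    "una foto de un gato",       # Spanish: "a photo of a cat"
    "una foto de un perro",      # Spanish: "a photo of a dog"
    "ein Foto von einem Vogel",  # German: "a photo of a bird"
]
text = tokenizer(labels)

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # Normalize embeddings, then softmax over scaled cosine similarities.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

for label, p in zip(labels, probs[0].tolist()):
    print(f"{label}: {p:.3f}")
```

Because the label prompts can be written in any of the 201 supported languages, the same script serves as multilingual image labeling without retraining.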

Model Capabilities

Zero-shot image classification
Multilingual text understanding
Cross-modal retrieval

Use Cases

Multilingual image classification
Multilingual image labeling
Classify and label images in any of the supported languages.
Particularly strong results on low-resource languages.
Cross-modal retrieval
Image-text matching
Match images with text descriptions across languages (a retrieval sketch follows below).
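
For the retrieval use case, a similar sketch ranks candidate images against a text query by cosine similarity. As above, this assumes the OpenCLIP-compatible loading path; the file names and the query string are illustrative only.

```python
import torch
import open_clip
from PIL import Image

# Assumption: same OpenCLIP-compatible checkpoint as above; file names are placeholders.
model, _, preprocess = open_clip.create_model_and_transforms(
    "hf-hub:visheratin/nllb-clip-large-oc"
)
tokenizer = open_clip.get_tokenizer("hf-hub:visheratin/nllb-clip-large-oc")
model.eval()

image_paths = ["beach.jpg", "mountain.jpg", "city.jpg"]
query = "una playa al atardecer"  # Spanish: "a beach at sunset"

images = torch.stack([preprocess(Image.open(p)) for p in image_paths])
text = tokenizer([query])

with torch.no_grad():
    image_features = model.encode_image(images)
    text_features = model.encode_text(text)
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    # Cosine similarity between the query and each candidate image.
    scores = (text_features @ image_features.T).squeeze(0)

# Rank images by descending similarity to the multilingual query.
for path, score in sorted(zip(image_paths, scores.tolist()), key=lambda x: -x[1]):
    print(f"{path}: {score:.3f}")
```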