AIbase

# Multimodal vision-language model

Biotrove CLIP
MIT
BioTrove-CLIP is a set of CLIP-style vision-language foundation models for biodiversity, trained on a dataset of 40 million images covering 33,000 plant and animal species.
Text-to-Image English
BGLab
48
2
Qwen For Jawi V1
A Jawi OCR model fine-tuned from Qwen2-VL-2B-Instruct, specifically designed for recognizing historical Malay texts.
Image-to-Text Transformers
culturalheritagenus
155
1
© 2025 AIbase