# Multimodal vision-language model

BioTrove-CLIP is a suite of CLIP-style vision-language foundation models for biodiversity, trained on a dataset of 40 million images covering 33,000 plant and animal species.

Text-to-Image English

Qwen For Jawi V1

A Jawi OCR model fine-tuned from Qwen2-VL-2B-Instruct and designed specifically to recognize historical Malay texts.

culturalheritagenus
