# Multimodal Evaluation

TinyLLaVA-Video-R1

TinyLLaVA-Video-R1 is a small-scale video reasoning model built on TinyLLaVA-Video, a model whose training process is traceable. Reinforcement learning significantly improves its reasoning ability, and the model exhibits emergent "aha moments".

Llava Critic 7b Hf

This is a Transformers-compatible vision-language model with image understanding and text generation capabilities.

Uiclip Jitteredwebsites 2 224 Paraphrased

UIClip is a multimodal model that quantifies the design quality of user interface (UI) screenshots and their relevance to textual descriptions.

ChartVE is a visual entailment model designed to evaluate the factual accuracy of generated caption sentences relative to input charts.
