AIbase

# Multimodal Evaluation

Tinyllava Video R1
Apache-2.0
TinyLLaVA-Video-R1 is a small-scale video reasoning model built on TinyLLaVA-Video, a model with a traceable training process. Reinforcement learning significantly improves its reasoning and thinking abilities, and the model exhibits the emergent property of 'epiphany moments'.
Video-to-Text Transformers
Zhang199
123
2
Llava Critic 7b Hf
This is a transformers-compatible vision-language model with image understanding and text generation capabilities.
Text-to-Image Transformers
FuryMartin
21
1
Uiclip Jitteredwebsites 2 224 Paraphrased
MIT
UIClip is a multimodal model that scores the design quality of a user interface (UI) screenshot and its relevance to a textual description.
Text-to-Image Transformers
biglab
9,739
1
Chartve
Apache-2.0
ChartVE is a visual entailment model designed to evaluate the factual accuracy of generated caption sentences relative to input charts.
Image-to-Text Transformers English
khhuang
38
3
© 2025 AIbase