Yi VL 34B

Developed by 01-ai
Yi-VL-34B is an open-source multimodal model from the Yi series. It can understand image content and hold multi-turn conversations, and it performs strongly on the MMMU and CMMMU benchmarks.
Downloads 150
Release Time: 12/25/2023

Model Overview

Yi-VL is the multimodal version of the Yi large language model series. It supports both Chinese and English and can understand and analyze image content, answer visual questions, and hold multi-turn dialogues.

Model Features

Bilingual Multimodal Support
Supports bilingual dialogues in Chinese and English, including text recognition within images.
High-Resolution Image Understanding
Supports image understanding at 448×448 resolution, which lets the model process finer visual details.
Multi-Turn Image-Text Dialogue
Can accept both text and images as input for multi-turn visual question answering.
Powerful Image Analysis Capabilities
Excels at extracting, organizing, and summarizing information from images.
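
To make the 448×448 resolution figure above more concrete, here is a minimal arithmetic sketch. It assumes a ViT-style vision encoder that splits a square image into square patches. The patch size of 14 is an illustrative assumption rather than a value stated on this page, and `patch_grid` is a hypothetical helper, not part of any Yi-VL API.

```python
def patch_grid(image_size: int, patch_size: int) -> tuple[int, int]:
    """Return (patches_per_side, total_patches) for a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side, per_side * per_side


# 448 comes from the model description; 14 is an assumed patch size.
print(patch_grid(448, 14))  # (32, 1024)
# A 224x224 input with the same assumed patch size, for comparison.
print(patch_grid(224, 14))  # (16, 256)
```

Under these assumptions, doubling the side length from 224 to 448 quadruples the number of patches, which is one way to read "finer visual details".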

Model Capabilities

Image content understanding
Visual question answering
Multi-turn dialogue
Bilingual processing (Chinese and English)
Image text recognition
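
The multi-turn, bilingual, image-plus-text dialogue listed above can be sketched as a simple conversation history. This is only an illustration of the data involved: the `Turn` and `Conversation` classes, the `image_path` field, and the rendered text layout are all hypothetical and do not represent Yi-VL's actual prompt format or loading code.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Turn:
    role: str                          # "user" or "assistant"
    text: str
    image_path: Optional[str] = None   # hypothetical: image attached to this turn


@dataclass
class Conversation:
    turns: List[Turn] = field(default_factory=list)

    def add_user(self, text: str, image_path: Optional[str] = None) -> None:
        self.turns.append(Turn("user", text, image_path))

    def add_assistant(self, text: str) -> None:
        self.turns.append(Turn("assistant", text))

    def render(self) -> str:
        """Flatten the history into plain text, marking turns that carry an image."""
        lines = []
        for turn in self.turns:
            marker = "[image] " if turn.image_path else ""
            lines.append(f"{turn.role}: {marker}{turn.text}")
        return "\n".join(lines)


conv = Conversation()
conv.add_user("What does this chart show?", image_path="chart.png")
conv.add_assistant("(model reply would go here)")
conv.add_user("图中的文字是什么?")  # follow-up in Chinese about the same image
print(conv.render())
```

Keeping the whole history, including which turn carried the image, is what allows a follow-up question to refer back to an earlier picture.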

Use Cases

Education
Multidisciplinary Visual Question Answering
Helps students understand complex diagrams and image content
Outstanding performance on the MMMU and CMMMU multidisciplinary benchmarks
Content Analysis
Image Content Summarization
Extracts key information from images and generates descriptions
Accurately identifies and describes objects and scenes within images