X

Xinyuan VL 2B

Developed by Cylingo
Xinyuan-VL-2B is a high-performance multimodal large model for edge-side applications launched by Cylingo Group, fine-tuned based on Qwen/Qwen2-VL-2B-Instruct, utilizing over 5 million multimodal data points and a small amount of pure text data.
Downloads 94
Release Time : 9/24/2024

Model Overview

Xinyuan-VL-2B is a high-performance multimodal large model focused on visual question answering tasks, supporting both Chinese and English, suitable for edge-side applications.

Model Features

High-Performance Multimodal
Outperforms open-source models of the same scale in multiple authoritative benchmarks.
Edge-Side Optimization
Designed specifically for edge-side applications, suitable for deployment on resource-constrained devices.
Bilingual Support (Chinese-English)
Supports multimodal understanding and generation tasks in both Chinese and English.

Model Capabilities

Visual Question Answering
Image Caption Generation
Multimodal Understanding
Text Generation

Use Cases

Intelligent Customer Service
Image Question Answering
Users upload images and ask questions, and the model generates accurate answers.
Achieved an accuracy of 74.3 on the MMB-CN-V11 beta test.
Education
Chart Understanding
Helps students understand complex charts and image content.
Achieved an accuracy of 74.2 on the AI2D chart understanding test.
Featured Recommended AI Models
AIbase
Empowering the Future, Your AI Solution Knowledge Base
Š 2025AIbase