Xgen Mm Phi3 Mini Instruct Singleimg R V1.5

Developed by Salesforce
xGen-MM is the latest series of foundational large multimodal models developed by Salesforce AI Research. It builds on the successful design of the BLIP series and adds improvements aimed at stronger multimodal processing.
Downloads 313
Release Time: 8/7/2024

Model Overview

The xGen-MM models are trained at scale on high-quality image-caption datasets and interleaved image-text data, which makes them suitable for multimodal tasks.

Model Features

Advanced Architecture
Builds on the successful design of the BLIP series, with fundamental enhancements for stronger multimodal processing.
Large-Scale Training
Trained at scale on high-quality image-caption datasets and interleaved image-text data.
Multiple Model Options
The series includes several model variants to suit different application requirements.

Model Capabilities

Image Understanding
Text Generation
Multimodal Reasoning
Visual Question Answering

Use Cases

Visual Question Answering
Image Content Understanding
Describes image content and answers questions about it
Scored 72.2 on the SEED-IMG benchmark
Multimodal Reasoning
Cross-Modal Understanding
Reasons over combined image and text information
Scored 76.8 on the MMB benchmark (development set)