xgen-mm-phi3-mini-base-r-v1.5

Developed by Salesforce
xGen-MM is a series of the latest foundational Large Multimodal Models (LMMs) developed by Salesforce AI Research. The series builds on the BLIP series, adding enhanced features and stronger foundational capabilities.
Downloads 830
Release Date: 8/12/2024

Model Overview

The xGen-MM models are trained at scale on high-quality image caption datasets and interleaved image-text data, and support multimodal tasks.

Model Features

Multimodal in-context learning
Learns from examples supplied in a multimodal context and handles complex interactions between images and text
Benchmark performance
Performs well on multiple benchmarks, including VQAv2, TextVQA, and OKVQA
Interleaved image-text processing
Optimized for interleaved image-text input, making it suitable for complex multimodal scenarios
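
As an illustration of what interleaved image-text input with in-context examples can look like, here is a minimal sketch that assembles a few-shot prompt in which each image is represented by a placeholder. The `<image>` placeholder string, the `Question:`/`Answer:` layout, and the helper names `Example` and `build_interleaved_prompt` are assumptions made for this sketch, not the model's documented prompt template; consult the official model card for the exact format.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumption: placeholder marking where an image is spliced into the text.
IMAGE_PLACEHOLDER = "<image>"


@dataclass
class Example:
    """One in-context demonstration: an image plus a question/answer pair."""
    image_path: str
    question: str
    answer: str


def build_interleaved_prompt(
    shots: List[Example], query_image: str, query_question: str
) -> Tuple[str, List[str]]:
    """Return (prompt_text, image_paths).

    The prompt contains one placeholder per image, and image_paths lists
    the images in the same order as their placeholders.
    """
    parts: List[str] = []
    images: List[str] = []
    for shot in shots:
        parts.append(
            f"{IMAGE_PLACEHOLDER}\nQuestion: {shot.question}\nAnswer: {shot.answer}"
        )
        images.append(shot.image_path)
    # The final block leaves the answer open for the model to complete.
    parts.append(f"{IMAGE_PLACEHOLDER}\nQuestion: {query_question}\nAnswer:")
    images.append(query_image)
    return "\n\n".join(parts), images


if __name__ == "__main__":
    demo = [Example("cat.jpg", "What animal is shown?", "a cat")]
    prompt, paths = build_interleaved_prompt(demo, "sky.jpg", "What color is the sky?")
    print(prompt)
    print(paths)
```

Keeping the image list separate from the text lets a preprocessing step load the images and match them to placeholders by position.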

Model Capabilities

Image understanding
Text generation
Multimodal question answering
Image caption generation
In-context learning

Use Cases

Visual question answering
Question answering about image content
Answers questions based on the content of an image
Scored 66.9 on the VQAv2 benchmark
Image caption generation
Automatic image description
Generates accurate descriptions of images
Scored 109.8 on the COCO benchmark
Multimodal interaction
Complex scenario understanding
Handles complex scenarios containing multiple images and text passages
Performs well on interleaved image-text tasks
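
For context on the VQAv2 score listed under visual question answering: VQA benchmarks commonly score a predicted answer by how many human annotators gave the same answer, capped at full credit once three agree. The sketch below implements that simplified rule. The function names are hypothetical, and it is not the official evaluation script, which also normalizes answers and averages over annotator subsets.

```python
from typing import List


def vqa_accuracy(predicted: str, human_answers: List[str]) -> float:
    """Simplified VQA accuracy: min(number of matching annotators / 3, 1)."""
    target = predicted.strip().lower()
    matches = sum(1 for ans in human_answers if ans.strip().lower() == target)
    return min(matches / 3.0, 1.0)


def mean_vqa_score(predictions: List[str], references: List[List[str]]) -> float:
    """Average per-question accuracy, reported on a 0-100 scale."""
    scores = [vqa_accuracy(p, refs) for p, refs in zip(predictions, references)]
    return 100.0 * sum(scores) / len(scores)


if __name__ == "__main__":
    refs = [["cat"] * 2 + ["dog"] * 8, ["blue"] * 10]
    print(mean_vqa_score(["cat", "blue"], refs))
```

Under this rule, an answer that matches only two of ten annotators earns partial credit (2/3), which is why benchmark scores are not simple exact-match rates.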
© 2025 AIbase