ShareGPT4V-7B

Developed by Lin-Chen
ShareGPT4V-7B is an open-source multimodal chatbot model trained using GPT4-Vision-assisted data and LLaVA instruction fine-tuning data.
Downloads 530
Release Date: 11/20/2023

Model Overview

This model combines the CLIP vision tower with a LLaMA/Vicuna language model and is intended for research on large multimodal models and chatbot applications.
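
As a rough illustration of how a vision tower and a language model are commonly joined in LLaVA-style designs, the sketch below counts the patch tokens a ViT-style encoder would emit and splices them into a text sequence in place of an image placeholder. The 336-pixel input size, the 14-pixel patch size, the `<image>` placeholder, and both function names are assumptions chosen for illustration; this page does not confirm them, and the sketch is not the model's actual implementation.

```python
# Illustrative sketch only: sizes and names are assumptions, not taken from this page.

IMAGE_PLACEHOLDER = "<image>"  # hypothetical placeholder token


def num_patch_tokens(image_size: int = 336, patch_size: int = 14) -> int:
    """Number of patch tokens a ViT-style vision tower emits for a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side


def splice_image_tokens(text_tokens: list[str], image_tokens: list[str]) -> list[str]:
    """Replace each image placeholder in the text sequence with the image tokens."""
    merged: list[str] = []
    for token in text_tokens:
        if token == IMAGE_PLACEHOLDER:
            merged.extend(image_tokens)  # image features take the placeholder's position
        else:
            merged.append(token)
    return merged


prompt = ["USER:", IMAGE_PLACEHOLDER, "What", "is", "shown?", "ASSISTANT:"]
image_tokens = [f"<patch_{i}>" for i in range(num_patch_tokens())]
sequence = splice_image_tokens(prompt, image_tokens)
print(len(image_tokens), len(sequence))
```

In a real model the spliced items would be embedding vectors rather than strings, with a projection layer mapping vision features into the language model's embedding space; strings are used here only to keep the sketch dependency-free.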

Model Features

Multimodal Understanding
Capable of processing both image and text inputs to understand visual and textual content.
High-Quality Training Data
Trained on 1.2 million high-quality image-text pairs and 100,000 GPT4-Vision-generated image-text pairs.
Open-Source and Extensible
The model is fully open-source and can be loaded directly within the LLaVA codebase.

Model Capabilities

Image Understanding
Multimodal Dialogue
Image-Text Generation
Visual Question Answering

Use Cases

Research
Multimodal Model Research
Used for research at the intersection of computer vision and natural language processing.
Application Development
Intelligent Chatbot
Develop dialogue systems capable of understanding image content.