PLLaVA-7B: Open-Source Video Language Chatbot, Free for Multimodal and Chat Research

Pllava 7b

Developed by ermu2001

PLLaVA is an open-source video-language chatbot, obtained by fine-tuning a large image-language model on video instruction-following data. It is intended for research on large multimodal models and chatbots.

Text-to-Video

Transformers

Open Source License: Apache-2.0

#Video language interaction #Multimodal research #Instruction following fine-tuning

Downloads 109

Release Time: 4/24/2024

Model Overview

PLLaVA is an autoregressive language model based on the Transformer architecture. It was trained by fine-tuning a large image-language model on video instruction-following data and is mainly intended for research on large multimodal models and chatbots.

Model Features

Video language understanding

Understands and responds to language instructions about video content

Multimodal ability

Combines visual and language modalities for understanding and generation

Open-source research tool

Provides an open-source foundation for research on large multimodal models

Model Capabilities

Video content understanding

Multimodal dialogue

Instruction following

Visual question answering

Use Cases

Academic research

Multimodal model research

Used to explore multimodal model architectures that combine video and language

Chatbot development

Serves as a base model for video dialogue chatbots

Application development

Video content analysis

Automatically analyzes video content and generates descriptions
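Video content analysis of this kind usually starts by reducing a clip to a fixed number of frames before handing them to the model. The following is a minimal sketch of uniform frame sampling, not code from the PLLaVA repository: the function name `sample_frame_indices` and the default of 16 frames are assumptions chosen for illustration.

```python
from __future__ import annotations


def sample_frame_indices(total_frames: int, num_frames: int = 16) -> list[int]:
    """Pick frame indices spread evenly over a clip.

    Hypothetical helper: both the name and the default of 16 frames are
    illustrative assumptions, not values taken from the model card.
    """
    if total_frames <= 0 or num_frames <= 0:
        raise ValueError("total_frames and num_frames must be positive")
    # Short clips: return every frame rather than repeating any.
    if num_frames >= total_frames:
        return list(range(total_frames))
    # Split the clip into num_frames equal segments and take each midpoint.
    segment = total_frames / num_frames
    return [int(segment * i + segment / 2) for i in range(num_frames)]


if __name__ == "__main__":
    print(sample_frame_indices(100, 4))  # [12, 37, 62, 87]
```

The selected indices can then be used to decode only those frames with whichever video reader the application already depends on.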

Property	Details
Model Type	PLLaVA-7B is an open-source video-language chatbot trained by fine-tuning an Image-LLM on video instruction-following data. It is an auto-regressive language model based on the transformer architecture. Base LLM: llava-hf/llava-v1.6-vicuna-7b-hf
Model Date	PLLaVA-7B was trained in April 2024.
Paper or resources for more information	GitHub repo: https://github.com/magic-research/PLLaVA; project page: https://pllava.github.io/; paper link: https://arxiv.org/abs/2404.16994

Property	Details
Primary intended uses	The primary use of PLLaVA is research on large multimodal models and chatbots.
Primary intended users	The primary intended users of the model are researchers and hobbyists in computer vision, natural language processing, machine learning, and artificial intelligence.
