PLLaVA 7B
PLLaVA is an open-source video-language chatbot, obtained by fine-tuning a large image-language model on video instruction-following data. It is intended for research on large multimodal models and chatbots.
Downloads 109
Release Time: 4/24/2024
Model Overview
PLLaVA is an autoregressive language model based on the Transformer architecture. It is trained by fine-tuning a large image-language model on video instruction-following data and is intended mainly for research on large multimodal models and chatbots.
Model Features
Video language understanding
Capable of understanding and processing language instructions related to video content
Multimodal ability
Combines visual and language modalities for understanding and generation
Open-source research tool
Provides an open-source foundation for research on large multimodal models
Model Capabilities
Video content understanding
Multimodal dialogue
Instruction following
Visual question answering
Use Cases
Academic research
Multimodal model research
Used to explore multimodal model architectures that combine video and language
Chatbot development
Serves as a base model for video dialogue chatbots
Application development
Video content analysis
Automatically analyzes video content and generates descriptions
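Video content analysis with a video-language model usually starts by reducing a clip to a fixed number of frames before they are passed to the model. The sketch below illustrates one common way to do that, uniform sampling, in plain Python. It is an assumption for illustration only: the function name `sample_frame_indices` and the sample counts are hypothetical and are not taken from the PLLaVA codebase or its documented preprocessing.

```python
def sample_frame_indices(total_frames: int, num_samples: int) -> list[int]:
    """Pick frame indices spread evenly across a video (hypothetical helper)."""
    if total_frames <= 0 or num_samples <= 0:
        raise ValueError("total_frames and num_samples must be positive")
    # A short video cannot supply more distinct frames than it contains.
    num_samples = min(num_samples, total_frames)
    # Split the video into equal-length segments and take the middle of each.
    segment = total_frames / num_samples
    return [int(segment * i + segment / 2) for i in range(num_samples)]


if __name__ == "__main__":
    # Example: choose 4 frames from a 10-frame clip.
    print(sample_frame_indices(10, 4))
```

The selected indices would then be used to decode the corresponding frames with a video reader of your choice. Taking the middle of each segment avoids always picking the first and last frames, which are often fades or title cards.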