
LLaVAction 0.5B

Developed by MLAdaptiveIntelligence
LLaVAction is a multimodal large language model for action recognition, based on the Qwen2 language model, trained on the EPIC-KITCHENS-100-MQA dataset.
Downloads 215
Release Time: 3/24/2025

Model Overview

This model focuses on video action recognition. It understands the actions shown in first-person (egocentric) video and is best suited to analyzing footage similar to EPIC-KITCHENS-100.

Model Features

Multimodal understanding capability
Combines visual and linguistic information to understand video content and generate relevant descriptions
First-person perspective action recognition
Specifically designed to recognize hand-object interaction actions in first-person perspective videos
Large context window
Supports a 32K token context window, suitable for processing long video content
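To make the context-window feature concrete, here is a minimal sketch of budgeting video frames against a 32K token limit and picking evenly spaced frames from a clip. It is not taken from the model's code: the function names, the tokens-per-frame figure, and the reserved text budget are all hypothetical, and 32K is assumed to mean 32 * 1024 tokens.

```python
def max_frames_for_budget(context_tokens: int,
                          tokens_per_frame: int,
                          reserved_text_tokens: int) -> int:
    """Return how many frames fit once text tokens are set aside."""
    available = context_tokens - reserved_text_tokens
    if available <= 0 or tokens_per_frame <= 0:
        return 0
    return available // tokens_per_frame


def sample_frame_indices(total_frames: int, num_samples: int) -> list[int]:
    """Pick frame indices spread evenly across the clip."""
    if total_frames <= 0 or num_samples <= 0:
        return []
    num_samples = min(num_samples, total_frames)
    # Split the clip into equal segments and take each segment's midpoint.
    step = total_frames / num_samples
    return [int(step * i + step / 2) for i in range(num_samples)]


CONTEXT_TOKENS = 32 * 1024   # assumption: "32K" read as 32,768 tokens
TOKENS_PER_FRAME = 196       # hypothetical; the real value depends on the vision encoder
RESERVED_TEXT_TOKENS = 1024  # hypothetical budget for the prompt and the answer

frame_budget = max_frames_for_budget(
    CONTEXT_TOKENS, TOKENS_PER_FRAME, RESERVED_TEXT_TOKENS
)
# Sample 16 frames from a 3000-frame clip, capped by the budget.
frame_indices = sample_frame_indices(3000, min(frame_budget, 16))
```

Under these assumed numbers, the budget allows far more frames than the 16 sampled here, so there is headroom for longer clips. The actual per-frame cost should be measured with the model's own processor before relying on such an estimate.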

Model Capabilities

Video content understanding
Action recognition
Multimodal question answering
Video frame analysis
Temporal information processing
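Action recognition and multimodal question answering can be combined by posing recognition as a multiple-choice question over candidate actions. The sketch below only illustrates that idea. The prompt wording, the helper names, and the reply format are hypothetical and do not reproduce the format used by LLaVAction or EPIC-KITCHENS-100-MQA.

```python
import re
import string
from typing import Optional


def build_mqa_prompt(question: str, options: list[str]) -> str:
    """Format a question and candidate actions as lettered choices."""
    letters = string.ascii_uppercase
    if not options or len(options) > len(letters):
        raise ValueError("need between 1 and 26 options")
    lines = [question]
    for letter, option in zip(letters, options):
        lines.append(f"{letter}. {option}")
    lines.append("Answer with the letter of the best option.")
    return "\n".join(lines)


def parse_choice(reply: str, options: list[str]) -> Optional[str]:
    """Map a reply that starts with a choice letter back to its option.

    Returns None when the reply does not start with a valid letter.
    """
    match = re.match(r"\s*\(?([A-Za-z])[.):]?(?:\s|$)", reply)
    if match is None:
        return None
    index = string.ascii_uppercase.index(match.group(1).upper())
    return options[index] if index < len(options) else None


# Hypothetical candidate actions for a kitchen clip.
candidate_actions = ["cut onion", "wash pan", "open fridge"]
prompt = build_mqa_prompt(
    "What action is the person performing?", candidate_actions
)
```

The prompt would be sent to the model together with the sampled video frames, and `parse_choice` would then turn a reply such as "B. wash pan" back into the action label. The parser is deliberately strict: free-form replies that do not start with a letter return `None` rather than a guess.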

Use Cases

Smart home
Kitchen activity analysis
Identifies various operational activities of users in the kitchen
Recognizes common kitchen actions such as chopping and cooking
Behavioral research
Daily activity analysis
Studies human daily activity patterns and behavioral habits
© 2025 AIbase