# Low-token multimodal

Videochat Flash Qwen2 7B Res448

VideoChat-Flash-7B is a multimodal model built on the UMT-L video encoder (300M parameters) and the Qwen2-7B language model. It uses only 16 tokens per frame and supports input sequences of up to approximately 10,000 frames.
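To put the two figures in the description side by side, the following is a rough sketch of the visual token budget they imply. It assumes the count is simply frames multiplied by tokens per frame, and it ignores text prompt tokens and any special tokens the model may add. The function name `visual_token_count` is hypothetical and not part of any library.

```python
# 16 tokens per frame is taken from the model description;
# everything else here is an illustrative assumption.
TOKENS_PER_FRAME = 16


def visual_token_count(num_frames: int, tokens_per_frame: int = TOKENS_PER_FRAME) -> int:
    """Estimate the visual tokens produced for a clip of `num_frames` frames."""
    if num_frames < 0:
        raise ValueError("num_frames must be non-negative")
    return num_frames * tokens_per_frame


if __name__ == "__main__":
    # Print the estimate for a few clip lengths, up to the stated ~10,000-frame limit.
    for frames in (64, 1_000, 10_000):
        print(f"{frames} frames -> {visual_token_count(frames)} visual tokens")
```

Under these assumptions, the stated maximum of about 10,000 frames corresponds to roughly 160,000 visual tokens.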

Transformers English

© 2025 AIbase