MobileVLM 1.7B
Developed by mtgv
MobileVLM is a lightweight multi-modal vision-language model designed specifically for mobile devices, supporting efficient image understanding and text generation tasks.
Downloads: 647
Release date: 12/31/2023
Model Overview
MobileVLM is a multi-modal vision-language model optimized for mobile devices, combining efficient vision and language processing capabilities, suitable for real-time interaction scenarios on mobile devices.
Model Features
Optimized for mobile devices
Designed specifically for mobile devices, supporting efficient inference on mobile CPUs and GPUs.
Multi-modal interaction
Achieves cross-modal interaction between vision and language modalities through an efficient projector.
High-performance inference
Reaches inference speeds of 21.5 tokens per second on a Qualcomm Snapdragon 888 CPU and 65.3 tokens per second on an NVIDIA Jetson Orin GPU.
Lightweight architecture
Built on lightweight language models with 1.4 billion or 2.7 billion parameters, suitable for mobile deployment.
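The throughput figures above translate directly into per-token latency budgets for real-time use. A minimal sketch (variable names are illustrative, not from the model's code):

```python
# Quoted throughput figures (tokens/second) from the model card.
snapdragon_888_cpu_tps = 21.5
jetson_orin_gpu_tps = 65.3

def latency_ms_per_token(tokens_per_second: float) -> float:
    """Convert a throughput figure into average per-token latency in milliseconds."""
    return 1000.0 / tokens_per_second

print(round(latency_ms_per_token(snapdragon_888_cpu_tps), 1))  # about 46.5 ms/token
print(round(latency_ms_per_token(jetson_orin_gpu_tps), 1))     # about 15.3 ms/token
```

At roughly 46 ms per token on a flagship phone CPU, a 30-token caption completes in under 1.5 seconds, which is what makes the real-time mobile use cases below plausible.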
Model Capabilities
Image understanding
Text generation
Multi-modal interaction
Real-time inference on mobile devices
Use Cases
Mobile applications
Real-time image description
Generate real-time image descriptions on mobile devices.
Provides efficient, low-latency image understanding.
Multi-modal chat assistant
An interactive chat assistant that combines images and text.
Supports intelligent responses to natural language and visual inputs.
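For the chat-assistant use case, image and text inputs are typically interleaved in a single prompt string. The sketch below assumes the LLaVA-style conversation template (USER/ASSISTANT roles and an `<image>` placeholder) that MobileVLM's reference code follows; verify the exact template against the mtgv repository before relying on it:

```python
# Assumption: LLaVA-style conversation template with an "<image>" placeholder
# that the vision projector's image features are substituted into.
IMAGE_TOKEN = "<image>"

def build_prompt(question: str) -> str:
    """Interleave the image placeholder with a user question for one chat turn."""
    return f"USER: {IMAGE_TOKEN}\n{question} ASSISTANT:"

print(build_prompt("What is shown in this picture?"))
```

The model generates its reply after the trailing `ASSISTANT:` marker, so the application only needs to decode tokens from that point onward.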
© 2025 AIbase