
nanoVLM

Developed by andito
nanoVLM is a lightweight vision-language model (VLM) designed for efficient training and experimentation.
Downloads: 187
Released: May 26, 2025

Model Overview

nanoVLM combines a ViT-based image encoder and a lightweight causal language model to form a compact vision-language model suitable for multimodal tasks.
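
How these pieces fit together can be pictured in a few lines of PyTorch. The sketch below is illustrative only: the class name (ToyVLM), the dimensions, and the stand-in encoder and decoder are assumptions for exposition, not nanoVLM's actual code. It shows the key idea of projecting image patch features into the language model's embedding space and prepending them to the text tokens.

import torch
import torch.nn as nn

class ToyVLM(nn.Module):
    # Illustrative stand-in for a ViT-encoder + causal-LM composition;
    # names and sizes are assumptions, not nanoVLM's real modules.
    def __init__(self, img_dim=384, lm_dim=576, vocab=32000):
        super().__init__()
        # Stand-in for the ViT image encoder: a 224x224 image becomes a
        # 7x7 grid of patch embeddings (a real ViT adds transformer layers).
        self.vision_encoder = nn.Sequential(
            nn.Conv2d(3, img_dim, kernel_size=32, stride=32),
            nn.Flatten(2),                       # (B, img_dim, 49)
        )
        # Projects image features into the language model's embedding space.
        self.projector = nn.Linear(img_dim, lm_dim)
        self.token_emb = nn.Embedding(vocab, lm_dim)
        # Stand-in for the causal LM backbone (causal mask omitted for
        # brevity; a real causal LM masks attention to future tokens).
        layer = nn.TransformerEncoderLayer(lm_dim, nhead=8, batch_first=True)
        self.lm = nn.TransformerEncoder(layer, num_layers=2)
        self.lm_head = nn.Linear(lm_dim, vocab)

    def forward(self, image, input_ids):
        img_feats = self.vision_encoder(image).transpose(1, 2)  # (B, 49, img_dim)
        img_tokens = self.projector(img_feats)                  # (B, 49, lm_dim)
        txt_tokens = self.token_emb(input_ids)                  # (B, T, lm_dim)
        # Image tokens are prepended to the text sequence.
        seq = torch.cat([img_tokens, txt_tokens], dim=1)
        return self.lm_head(self.lm(seq))

model = ToyVLM()
logits = model(torch.randn(1, 3, 224, 224), torch.randint(0, 32000, (1, 8)))
print(logits.shape)  # torch.Size([1, 57, 32000]): 49 image + 8 text positions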

Model Features

Lightweight Design
The full model architecture and training logic fit in only about 750 lines of code, making the codebase easy to read, understand, and experiment with.
Compact Parameters
The combined image encoder and language model total only 222 million parameters, small enough for efficient training and deployment.
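
A parameter count like this is easy to sanity-check in PyTorch by summing the element counts of a module's parameters. A minimal sketch (the helper name count_parameters is our own, not part of nanoVLM):

import torch.nn as nn

def count_parameters(model: nn.Module) -> int:
    # Total number of trainable parameters in a PyTorch module.
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# Demo on a small stand-in layer; applied to the actual nanoVLM model,
# this should report roughly 222 million.
print(count_parameters(nn.Linear(768, 32000)))  # 24608000 (768*32000 + 32000)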

Model Capabilities

Image-Text Generation (see the decoding sketch below)
Multimodal Understanding
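
Image-text generation in a model like this is ordinary autoregressive decoding, conditioned on the projected image tokens. The greedy-decoding loop below is a minimal sketch with a dummy logits function standing in for the model's forward pass; the names and token IDs are illustrative assumptions.

import torch

def greedy_generate(step_fn, prompt_ids, max_new_tokens=20, eos_id=2):
    # Repeatedly pick the most likely next token until EOS or the length
    # budget is reached. step_fn(ids) -> logits of shape (1, len(ids), vocab);
    # in a VLM this forward pass is conditioned on the image tokens.
    ids = prompt_ids.clone()
    for _ in range(max_new_tokens):
        logits = step_fn(ids)
        next_id = logits[0, -1].argmax().item()
        ids = torch.cat([ids, torch.tensor([[next_id]])], dim=1)
        if next_id == eos_id:
            break
    return ids

# Dummy stand-in that returns random logits over a 100-token vocabulary.
dummy = lambda ids: torch.randn(1, ids.shape[1], 100)
print(greedy_generate(dummy, torch.tensor([[1, 5, 9]])))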

Use Cases

Research Experiments
Vision-Language Model Research
Used to study the performance and efficiency of lightweight vision-language models.