SmolVLM2 2.2B Instruct i1 GGUF

Developed by mradermacher
SmolVLM2-2.2B-Instruct is a 2.2B-parameter vision-language model focused on video-text-to-text tasks. It supports English.
Downloads 285
Release Time: 4/25/2025

Model Overview

This is a quantized version of the SmolVLM2-2.2B-Instruct vision-language model. The base model was trained on multiple video and text datasets and is suited to video content understanding and text generation tasks.

Model Features

Trained on multiple datasets
The model is trained on multiple high-quality video and text datasets, including the_cauldron, Docmatix, LLaVA-OneVision-Data, etc.
Diverse quantization versions
Multiple quantization versions are provided, ranging from IQ1_S (very small, very low quality) to Q6_K (larger, high quality), so users can trade file size against output quality to fit their hardware.
Video understanding ability
Focuses on understanding video content and generating text about it, making it suitable for tasks such as video subtitle generation and video content analysis.
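
To make the size and quality trade-off between quantization versions concrete, here is a minimal sketch that estimates weight size from a bits-per-weight figure and picks the largest type that fits a memory budget. The function names, the budget values, and the table are illustrative assumptions, not part of the model repository. The two bits-per-weight values are the nominal block sizes of these GGUF types, so real file sizes will differ (mixed tensor types, metadata, and a separate vision projector file).

```python
# Nominal bits per weight for the two quantization types named above.
# Assumption: these are block-level figures, not measured file sizes.
NOMINAL_BPW = {
    "IQ1_S": 1.5625,
    "Q6_K": 6.5625,
}


def estimate_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough weight-only size in gigabytes (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def pick_quant(n_params: float, budget_gb: float, table=NOMINAL_BPW):
    """Return the highest-bpw type whose estimate fits the budget, or None."""
    fitting = [
        (bpw, name)
        for name, bpw in table.items()
        if estimate_size_gb(n_params, bpw) <= budget_gb
    ]
    return max(fitting)[1] if fitting else None


if __name__ == "__main__":
    n_params = 2.2e9  # parameter count stated for SmolVLM2-2.2B-Instruct
    for name, bpw in NOMINAL_BPW.items():
        print(f"{name}: ~{estimate_size_gb(n_params, bpw):.2f} GB")
    print("Fits in 1 GB:", pick_quant(n_params, 1.0))
```

Under these assumptions the estimate is roughly 0.43 GB for IQ1_S and 1.80 GB for Q6_K. Treat the numbers as a planning aid only and check the actual file sizes in the repository before downloading.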

Model Capabilities

Video content understanding
Text generation
Video subtitle generation
Multimodal reasoning

Use Cases

Video content analysis
Video subtitle generation
Generate descriptive subtitles for video content
Video content summarization
Extract key information from the video and generate a summary
Education
Educational video explanation
Generate explanatory text for educational videos