Finetune VQA 1B

Developed by TienAnh
A visual question answering model fine-tuned from InternVL3-1B and Vintern-1B-v3_5, with Vietnamese support, suited to image content understanding and question-answering tasks.
Downloads 20
Release Time: 5/10/2025

Model Overview

This is a visual question answering (VQA) model that interprets image content and answers questions about it. It is fine-tuned from InternVL3-1B and Vintern-1B-v3_5 and is specifically optimized for Vietnamese.

Model Features

Multi-slice Image Processing
Supports dynamic image preprocessing: each image is automatically split into multiple slices, with the slice layout chosen to preserve the image's aspect ratio and keep processing efficient.
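
One way to picture the slicing step is the grid selection below. This is a minimal sketch, not the model's actual preprocessing code: the function names `choose_tile_grid` and `tile_boxes`, the 448-pixel tile size, and the 12-tile cap are assumptions for illustration, loosely modeled on the dynamic preprocessing used by InternVL-family models.

```python
def choose_tile_grid(width: int, height: int, max_tiles: int = 12) -> tuple:
    """Return the (cols, rows) grid whose aspect ratio best matches the image.

    max_tiles is an assumed cap on the total number of slices.
    """
    aspect = width / height
    candidates = [
        (cols, rows)
        for cols in range(1, max_tiles + 1)
        for rows in range(1, max_tiles + 1)
        if cols * rows <= max_tiles
    ]
    # Closest aspect ratio wins; ties go to the grid with fewer tiles.
    return min(
        candidates,
        key=lambda g: (abs(aspect - g[0] / g[1]), g[0] * g[1]),
    )


def tile_boxes(cols: int, rows: int, tile_size: int = 448) -> list:
    """List crop boxes (left, top, right, bottom), row by row.

    The boxes apply to an image already resized to
    (cols * tile_size) x (rows * tile_size). tile_size is an assumption.
    """
    return [
        (c * tile_size, r * tile_size, (c + 1) * tile_size, (r + 1) * tile_size)
        for r in range(rows)
        for c in range(cols)
    ]


if __name__ == "__main__":
    cols, rows = choose_tile_grid(800, 400)
    print(cols, rows, tile_boxes(cols, rows))
```

For an 800x400 image the sketch picks a 2x1 grid, so the image would be resized to 896x448 and cut into two square slices.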
Vietnamese Optimization
Fine-tuned specifically for Vietnamese, with a focus on Vietnamese visual question-answering tasks.
Efficient Inference
Supports bfloat16 precision and optional flash attention, improving inference speed while maintaining accuracy.
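
The precision and attention options could be wired up roughly as follows. This is a hedged sketch that assumes the model loads through Hugging Face `transformers` with custom remote code, as InternVL-family checkpoints typically do. The helper names `build_load_kwargs` and `load_model` are hypothetical, the `use_flash_attn` keyword is assumed from InternVL-style loading code, and the repository ID is left as a parameter because the page does not state it.

```python
def build_load_kwargs(use_flash_attn: bool = False) -> dict:
    """Build keyword arguments for AutoModel.from_pretrained.

    The keys follow the pattern used by InternVL-style models; treat them
    as assumptions and check the model's own loading instructions.
    """
    return {
        "torch_dtype": "bfloat16",         # resolved to torch.bfloat16 at load time
        "low_cpu_mem_usage": True,
        "trust_remote_code": True,         # model code ships with the checkpoint
        "use_flash_attn": use_flash_attn,  # optional; requires flash-attn installed
    }


def load_model(model_id: str, use_flash_attn: bool = False):
    """Load the model in bfloat16 (hypothetical helper, not an official API)."""
    # Imported lazily so build_load_kwargs can be used without torch installed.
    import torch
    from transformers import AutoModel

    kwargs = build_load_kwargs(use_flash_attn)
    kwargs["torch_dtype"] = getattr(torch, kwargs["torch_dtype"])
    return AutoModel.from_pretrained(model_id, **kwargs).eval()
```

Leaving flash attention off by default keeps the sketch usable on machines where the `flash-attn` package is not installed. Enabling it is a single flag.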

Model Capabilities

Image Content Understanding
Visual Question Answering
Key Information Extraction from Images
Multilingual Support (Primarily Vietnamese)

Use Cases

Education
Vietnamese Learning Assistance
Helps students understand Vietnamese vocabulary and expressions through images.
Enhances language learning efficiency and engagement.
Content Moderation
Image Content Analysis
Automatically analyzes image content and answers related questions.
Improves moderation efficiency and accuracy.