
TaiVisionLM Base V2

Developed by benchang1110
The first vision-language model to support Traditional Chinese instruction input (1.2B parameters); compatible with the Hugging Face Transformers library, quick to load, and easy to fine-tune
Downloads: 122
Release date: 9/17/2024

Model Overview

A multimodal large language model that combines a SigLIP vision encoder with the TinyLlama language model through a visual projector, designed specifically for Traditional Chinese vision-language tasks
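
Because the model loads through the standard Transformers API, image captioning can be run along the following lines. This is a minimal sketch only: the Hugging Face repo id, the prompt wording, and the use of trust_remote_code=True are assumptions, not details confirmed on this page.

    # Minimal sketch: load the model and generate a Traditional Chinese caption.
    # Repo id, prompt wording, and trust_remote_code usage are assumptions.
    import requests
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    repo_id = "benchang1110/TaiVisionLM-base-v2"  # assumed Hugging Face repo id

    processor = AutoProcessor.from_pretrained(repo_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)

    image = Image.open(requests.get("https://example.com/street.jpg", stream=True).raw)
    prompt = "描述這張圖片"  # "Describe this image" in Traditional Chinese

    inputs = processor(text=prompt, images=image, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=128)
    print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])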

Model Features

Traditional Chinese support
First vision-language model specifically supporting Traditional Chinese instruction input
Efficient architecture
Lightweight design with only 1.2B parameters, maintaining high performance while reducing computational requirements
Transformers compatibility
Fully compatible with the Hugging Face Transformers library; no additional dependencies required
Multi-stage training
Adopts a three-stage development process: unimodal pretraining, feature alignment, and task-specific training (a training sketch follows this list)
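
As a rough illustration of the feature-alignment stage, one can freeze the SigLIP encoder and the TinyLlama backbone and train only the visual projector. The sketch below is an assumption about how this could be set up; in particular, matching parameter names on the substring "projector" is a hypothetical convention, not the model's confirmed module naming.

    # Illustrative sketch of the feature-alignment stage: train only the visual
    # projector while the vision encoder and language backbone stay frozen.
    # The "projector" substring match is a hypothetical naming convention.
    import torch
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(
        "benchang1110/TaiVisionLM-base-v2",  # assumed repo id
        trust_remote_code=True,
    )

    for name, param in model.named_parameters():
        param.requires_grad = "projector" in name  # keep only the projector trainable

    optimizer = torch.optim.AdamW(
        (p for p in model.parameters() if p.requires_grad), lr=1e-4
    )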

Model Capabilities

Image caption generation
Visual question answering
Multimodal understanding
Traditional Chinese text generation

Use Cases

Content understanding
Image captioning
Generate detailed Traditional Chinese descriptions for images
Version v2 provides more detailed analysis of visual elements than v1
Visual question answering
Answer Traditional Chinese questions about image content (see the snippet after this list)
Educational applications
Learning assistance
Help Traditional Chinese users understand visual content
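
For visual question answering, the loading code from the overview sketch applies unchanged; only the prompt becomes a Traditional Chinese question. The question wording below is an example, not a prescribed prompt format.

    # Reuses `model`, `processor`, and `image` from the earlier loading sketch.
    # The question wording is only an example prompt, not a required format.
    question = "圖片裡有幾個人？"  # "How many people are in the picture?"
    inputs = processor(text=question, images=image, return_tensors="pt")
    answer_ids = model.generate(**inputs, max_new_tokens=64)
    print(processor.batch_decode(answer_ids, skip_special_tokens=True)[0])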