# Image Text Generation
Mistral Community Pixtral 12b GGUF
Apache-2.0
A GGUF quantization of the pixtral-12b model, produced with llama.cpp, for image-text-to-text tasks.
bartowski
1,728
4
Vitucano 2b8 V1
Apache-2.0
ViTucano is the first visual assistant natively pre-trained in Portuguese. It combines visual understanding with language capabilities and suits multimodal tasks such as image captioning and visual question answering.
Image-to-Text
Transformers Other

TucanoBR
86
5
Git Base Vqav2
MIT
GIT is a Transformer decoder-based vision-language model conditioned on both CLIP image tokens and text tokens, suitable for tasks like image captioning and visual question answering.
Image-to-Text
Transformers Supports Multiple Languages

microsoft
199
19
Featured Recommended AI Models