L

Llama 3.2 11B Vision Instruct Abliterated 8 Bit

Developed by mlx-community
This is a multimodal model based on Llama-3.2-11B-Vision-Instruct, which supports image and text input and generates text output.
Downloads 128
Release Time : 12/16/2024

Model Overview

This model is a vision-language model that can process image and text input and generate corresponding text output. It is suitable for multimodal tasks such as visual question answering and image description generation.

Model Features

Multimodal support
It can process image and text input simultaneously and generate text output.
Multilingual support
Supports multiple languages, including English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
8-bit quantization
The model has been processed with 8-bit quantization, reducing memory usage and computational resource requirements.

Model Capabilities

Image understanding
Text generation
Multimodal inference
Visual question answering

Use Cases

Visual question answering
Question answering about image content
Generate corresponding answers based on the input image and questions.
Image description generation
Automatic image description
Generate descriptive text based on the input image.
Featured Recommended AI Models
AIbase
Empowering the Future, Your AI Solution Knowledge Base
© 2025AIbase