Qwen2-VL-72B-Instruct-GGUF Open-Source Multimodal Model - Supports multimodal tasks and can be run on GaiaNet.

Qwen2 VL 72B Instruct GGUF

Developed by gaianet

Qwen2-VL-72B-Instruct-GGUF is a quantized version of the original model, supporting multimodal tasks and can be run through GaiaNet.

Downloads 1,803

Release Time : 12/15/2024

Model Overview

This is a multimodal model that supports image-text to text tasks and is suitable for complex visual language understanding and generation tasks.

Multimodal support

Supports the joint processing of images and text, suitable for complex visual language tasks.

High parameter count

With 72 billion parameters, it has powerful understanding and generation capabilities.

Quantized version

After quantization processing, it is convenient to run on devices with limited resources.

Image understanding

Text generation

Multimodal inference

Visual question answering

Image description generation

Generate detailed text descriptions based on the input images.

Document understanding

Document content extraction

Extract key information from documents in images and generate structured text.

Property	Details
Base Model	Qwen/Qwen2-VL-72B-Instruct
Model Creator	Qwen
Model Name	Qwen2-VL-72B-Instruct
Quantized By	Second State Inc.
Language	en
Pipeline Tag	image-text-to-text
Tags	multimodal
Library Name	transformers
License	other
License Name	tongyi-qianwen
License Link	https://huggingface.co/Qwen/Qwen2-VL-72B-Instruct/blob/main/LICENSE

Featured Recommended AI Models

Empowering the Future, Your AI Solution Knowledge Base