I

Internvl2 5 HiMTok 8B

Developed by yayafengzi
HiMTok is a hierarchical mask token learning framework fine-tuned on the InternVL2_5-8B large multimodal model, focusing on image segmentation tasks.
Downloads 16
Release Time : 3/20/2025

Model Overview

This model achieves efficient image segmentation through a hierarchical mask token learning framework, particularly suitable for tasks on the refcoco series datasets.

Model Features

Hierarchical Mask Token Learning
Adopts a hierarchical structure for image segmentation tasks, improving segmentation accuracy and efficiency.
Multimodal Capability
Combines vision and language understanding to support complex image segmentation tasks.
Based on Large Pre-trained Models
Fine-tuned on InternVL2_5-8B, featuring powerful feature extraction capabilities.

Model Capabilities

Image Segmentation
Mask Generation
Multimodal Understanding
Vision-Language Task Processing

Use Cases

Computer Vision
Referring Image Segmentation
Segments specific regions in an image based on textual descriptions.
Performs well on the refcoco series datasets.
Interactive Image Editing
Guides image editing and modification through natural language instructions.
Featured Recommended AI Models
AIbase
Empowering the Future, Your AI Solution Knowledge Base
Š 2025AIbase