
Dimple 7B

Developed by rp-yu
Dimple is the first discrete diffusion multimodal large language model (DMLLM), combining autoregressive and diffusion training paradigms. Trained on the same dataset as LLaVA-NEXT, it outperforms LLaVA-NEXT-7B by 3.9%.
Downloads 422
Release Time: 5/19/2025

Model Overview

Dimple is a multimodal large language model that integrates autoregressive and diffusion training paradigms, supporting image-to-text and text-to-text tasks.

Model Features

Hybrid Training
Combines autoregressive and diffusion training paradigms to enhance model performance.
Diffusion Decoding
Supports confidence decoding, random decoding, MaskGIT-style decoding, and entropy-based decoding.
Controlled Generation
Achieves fine-grained control over format, structure, and length through structural priors.
Autoregressive-like Prefilling
Prefills the prompt in a single pass, as in autoregressive models, to improve inference speed.
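To make the decoding strategies above concrete, here is a minimal, illustrative sketch of MaskGIT-style confidence decoding: the sequence starts fully masked, and at each step the positions where the model is most confident are committed while the rest stay masked. This is not Dimple's actual implementation; the `logits_fn` interface and the `MASK` sentinel are assumptions for the sketch.

```python
import math

MASK = -1  # hypothetical mask-token id, assumed for this sketch


def confidence_decode(logits_fn, length, steps):
    """MaskGIT-style confidence decoding (illustrative sketch).

    logits_fn(seq, i) is assumed to return (best_token, probability)
    for the masked position i given the partially decoded sequence.
    """
    seq = [MASK] * length
    per_step = math.ceil(length / steps)  # positions to commit each step
    for _ in range(steps):
        masked = [i for i, t in enumerate(seq) if t == MASK]
        if not masked:
            break
        preds = {i: logits_fn(seq, i) for i in masked}
        # Commit the most confident positions first.
        for i in sorted(masked, key=lambda i: preds[i][1], reverse=True)[:per_step]:
            seq[i] = preds[i][0]
    return seq


# Toy model: predicts token i at position i, with confidence 1 / (i + 1),
# so earlier positions are committed in earlier steps.
def toy_model(seq, i):
    return (i, 1.0 / (i + 1))


print(confidence_decode(toy_model, 4, 2))  # → [0, 1, 2, 3]
```

Random decoding would instead pick the positions to commit uniformly at random, and entropy-based decoding would rank positions by the entropy of the predicted distribution rather than the top probability.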

Model Capabilities

Image Caption Generation
Multimodal Instruction Following
Text Generation
Image Analysis

Use Cases

Multimodal Interaction
Image Captioning
Generate detailed descriptions of images.
Produces natural and accurate image captions.
Visual Question Answering
Answer questions about image content.
Provides accurate and contextually relevant answers.