Open-source HiDream-I1 model - A practical image generation tool for text-to-image and image-to-image generation.

Hidream I1

Developed by ControlNetLoRA

A ControlNet PEFT LoRA model based on HiDream-I1-Full, supporting text-to-image and image-to-image conversion

Tags: Image Generation, Open Source, ControlNet Fine-tuning, Low VRAM Optimization, Fast Image Generation

License: Other

Downloads 605

Release Time: 6/19/2025

Model Overview

This is a ControlNet PEFT LoRA adapter for the HiDream-I1-Full model, intended for image generation tasks.

Model Features

ControlNet PEFT LoRA Technology

A ControlNet adapter trained as a PEFT LoRA on top of the HiDream-I1-Full model.

Efficient Inference

Supports quantization to save VRAM. The base model was quantized during training, so quantization is also recommended during inference.

Text Encoder Reuse

The text encoder was not trained, so the base model's text encoder can be reused during inference.

Model Capabilities

Text-to-Image Generation

Image-to-Image Conversion

Use Cases

Creative Image Generation

Photorealistic Image Generation

Generates photorealistic images from text prompts.

For example, the Quick Start code uses the prompt "A photo-realistic image of a cat".

🚀 hidream-controlnet-lora-test

This project presents a ControlNet PEFT LoRA derived from HiDream-ai/HiDream-I1-Full. It supports text-to-image and image-to-image generation using diffusers and LoRA.

🚀 Quick Start

Prerequisites

Before using this project, install the libraries that the inference example imports: torch, diffusers, and optimum-quanto (the package that provides optimum.quanto). You can install them with pip or another package manager.

Inference

The following is an example of how to perform inference with this model:

import torch
from diffusers import DiffusionPipeline

model_id = 'HiDream-ai/HiDream-I1-Full'
adapter_id = 'bghira/hidream-controlnet-lora-test'
pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16) # loading directly in bf16
pipeline.load_lora_weights(adapter_id)

prompt = "A photo-realistic image of a cat"
negative_prompt = 'ugly, cropped, blurry, low-quality, mediocre average'

# Optional: quantise the model to save on VRAM.
# Note: the model was quantised during training, so it is recommended to do the same at inference time.
from optimum.quanto import quantize, freeze, qint8
quantize(pipeline.transformer, weights=qint8)
freeze(pipeline.transformer)

# Pick the best available device once and reuse it for the pipeline and the generator.
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'
pipeline.to(device) # the pipeline is already in its target precision level
model_output = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=16,
    generator=torch.Generator(device=device).manual_seed(42),
    width=256,
    height=256,
    guidance_scale=4.0,
).images[0]

model_output.save("output.png", format="PNG")
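
The optional quantisation step in the example stores the transformer weights as int8 instead of bfloat16. As a rough sketch of why that saves VRAM, the snippet below compares the memory needed for the weights alone at both precisions. The parameter count is a hypothetical round number chosen for illustration, not a figure from this model card, and the estimate ignores activations, text encoders, and the VAE.

```python
def weight_memory_gib(num_params: int, bytes_per_param: float) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3

# Hypothetical parameter count, used only for illustration.
NUM_PARAMS = 10_000_000_000

bf16_gib = weight_memory_gib(NUM_PARAMS, 2)  # bfloat16 stores 2 bytes per weight
int8_gib = weight_memory_gib(NUM_PARAMS, 1)  # int8 stores 1 byte per weight

print(f"bf16: {bf16_gib:.1f} GiB, int8: {int8_gib:.1f} GiB")
```

Under these assumptions, int8 weights take half the space of bf16 weights, which is the saving the quantisation note refers to.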

✨ Features

Based on ControlNet PEFT LoRA: Derived from HiDream-ai/HiDream-I1-Full, it provides enhanced control over image generation.
Multiple Generation Modes: Supports text-to-image and image-to-image generation.
Flexible Configuration: Allows users to adjust various parameters during training and inference.

📦 Installation

Install the required Python libraries with pip. The inference example imports optimum.quanto, which is provided by the optimum-quanto package, and loading LoRA weights in diffusers requires peft:

pip install torch diffusers transformers peft optimum-quanto

💻 Usage Examples

Basic Usage

The code in the Quick Start section demonstrates the basic usage of this model for text-to-image generation.

Advanced Usage

You can adjust parameters such as prompt, negative_prompt, num_inference_steps, and guidance_scale to generate different images.
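
One way to explore these parameters is a small sweep. The sketch below builds a list of keyword-argument sets that vary num_inference_steps and guidance_scale while keeping the Quick Start prompt and resolution fixed. The step and guidance values are hypothetical and are not recommendations from the model card. The pipeline call is left as a comment so the snippet runs without loading the model.

```python
from itertools import product

# Fixed settings taken from the Quick Start call.
base_kwargs = {
    "prompt": "A photo-realistic image of a cat",
    "negative_prompt": "ugly, cropped, blurry, low-quality, mediocre average",
    "width": 256,
    "height": 256,
}

# Hypothetical values to compare, chosen only for illustration.
step_options = [16, 28]
guidance_options = [3.0, 4.0, 5.0]

# One kwargs dict per (steps, guidance) combination.
runs = [
    {**base_kwargs, "num_inference_steps": steps, "guidance_scale": cfg}
    for steps, cfg in product(step_options, guidance_options)
]

for i, kwargs in enumerate(runs):
    # With a loaded pipeline, each configuration could be rendered with:
    #   image = pipeline(**kwargs).images[0]
    #   image.save(f"sweep_{i:02d}.png")
    print(i, kwargs["num_inference_steps"], kwargs["guidance_scale"])
```

Reusing the same seed for every run makes the resulting images easier to compare, since only the swept parameters change.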

📚 Documentation

Validation settings

CFG: 4.0
CFG Rescale: 0.0
Steps: 16
Sampler: FlowMatchEulerDiscreteScheduler
Seed: 42
Resolution: 256x256

Note: The validation settings are not necessarily the same as the training settings.


The text encoder was not trained. You may reuse the base model text encoder for inference.

Training settings

Property	Details
Training epochs	0
Training steps	2
Learning rate	0.0001
Learning rate schedule	constant
Warmup steps	500
Max grad value	2.0
Effective batch size	1
Micro-batch size	1
Gradient accumulation steps	1
Number of GPUs	1
Gradient checkpointing	True
Prediction type	flow_matching (extra parameters=['shift=3.0'])
Optimizer	adamw_bf16
Trainable parameter precision	Pure BF16
Base model precision	int8-quanto
Caption dropout probability	0.0%
LoRA Rank	1
LoRA Alpha	1.0
LoRA Dropout	0.1
LoRA initialisation style	default
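
Several rows in this table are related by simple arithmetic. The sketch below recomputes the effective batch size from its three factors and the LoRA scaling factor from alpha and rank, assuming the standard alpha / rank scaling. It also shows how the number of trainable parameters per adapted linear layer grows with rank. The layer shape in the last example is hypothetical and is not taken from the model.

```python
# Values copied from the training settings table.
micro_batch_size = 1
gradient_accumulation_steps = 1
num_gpus = 1
lora_rank = 1
lora_alpha = 1.0

# Number of samples that contribute to a single optimizer step.
effective_batch_size = micro_batch_size * gradient_accumulation_steps * num_gpus

# Assumed standard LoRA scaling: the low-rank update is multiplied by alpha / rank.
lora_scale = lora_alpha / lora_rank

def lora_param_count(in_features: int, out_features: int, rank: int) -> int:
    """Trainable parameters added to one linear layer: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)

# Hypothetical layer shape, chosen only to illustrate the formula.
example_params = lora_param_count(in_features=2560, out_features=2560, rank=lora_rank)

print(effective_batch_size, lora_scale, example_params)
```

With rank 1, each adapted layer adds only one row and one column of trainable weights, which is consistent with the adapter's name marking it as a test run.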

Datasets

antelope-data-256

Property	Details
Repeats	0
Total number of images	29
Total number of aspect buckets	1
Resolution	0.065536 megapixels
Cropped	True
Crop style	center
Crop aspect	square
Used for regularisation data	No
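
The dataset figures can be cross-checked against the training settings. The sketch below shows why 2 training steps over 29 images is reported as 0 epochs, and that 0.065536 megapixels corresponds to a 256x256 image. It assumes that a repeats value of 0 means each image is seen once per epoch, which is an assumption about the trainer rather than something stated in this document.

```python
import math

# Values copied from the dataset and training tables.
total_images = 29
repeats = 0
effective_batch_size = 1
training_steps = 2
width = height = 256  # matches the validation resolution

# Assumption: repeats=0 means each image is seen once per epoch.
samples_per_epoch = total_images * (repeats + 1)
steps_per_epoch = math.ceil(samples_per_epoch / effective_batch_size)

# Only whole epochs are counted, so 2 steps out of 29 rounds down to 0.
completed_epochs = training_steps // steps_per_epoch

megapixels = width * height / 1_000_000

print(steps_per_epoch, completed_epochs, megapixels)
```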

📄 License

This project is under the 'other' license.
