Quilt-1m-finetuned-sd3.5 Open Source Model - Focused on Generating Pathology Images

Developed by Minh-Ha

A full-rank fine-tune of sd3/unknown-model that specializes in pathology image generation.

Image Generation | Open Source | License: Other | #Pathology Image Generation #High-Resolution Diffusion Model #FlowMatch Sampling Optimization

Downloads 300

Release Time: 5/6/2025

Model Overview

This model is a full-rank fine-tune of sd3/unknown-model for image generation, primarily used to generate realistic pathology images.

Model Features

Pathology Image Generation

Specializes in generating realistic pathology images for medical research and education.

High-Resolution Support

Supports image generation up to 1024x1024 resolution.

Multi-Resolution Training

Training data includes 512x512, 768x768, and 1024x1024 resolutions to accommodate various needs.

Model Capabilities

Text-to-Image Generation

Image-to-Image Generation

High-Resolution Image Generation

Use Cases

Medical Research

Pathology Image Generation

Generate realistic pathology images for medical research and education.

Generated images can be used to simulate pathology samples, aiding medical education.

Image Enhancement

Low-Resolution Image Enhancement

Enhance low-resolution medical images to high-resolution.

🚀 quilt-1m-finetuned-sd3.5

This project is a full-rank fine-tune derived from sd3/unknown-model. It focuses on text-to-image generation; the validation and training settings used are documented below.

📄 License

The license for this project is listed as "other".

✨ Features

Multiple Modes: Supports text-to-image and image-to-image generation.
Fine-Tuned Model: Derived from sd3/unknown-model with specific training settings.
Gallery of Examples: Provides example images for reference.

📦 Installation

No installation steps are provided in the original document. The usage example below imports torch and diffusers, so both packages need to be installed first.

💻 Usage Examples

Basic Usage

import torch
from diffusers import DiffusionPipeline

model_id = 'Minh-Ha/quilt-1m-finetuned-sd3.5'

# Pick the best available device once and reuse it below.
if torch.cuda.is_available():
    device = 'cuda'
elif torch.backends.mps.is_available():
    device = 'mps'
else:
    device = 'cpu'

# Load the weights directly in bf16, then move the pipeline to the device.
pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipeline.to(device)  # the pipeline is already in its target precision level

prompt = "A photo-realistic pathology image"
negative_prompt = 'blurry, cropped, ugly'

# Steps, guidance scale, seed, and resolution match the validation settings.
model_output = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=20,
    generator=torch.Generator(device=device).manual_seed(42),
    width=1024,
    height=1024,
    guidance_scale=3.0,
).images[0]

model_output.save("output.png", format="PNG")

📚 Documentation

Validation settings

CFG: 3.0
CFG Rescale: 0.0
Steps: 20
Sampler: FlowMatchEulerDiscreteScheduler
Seed: 42
Resolution: 1024x1024
Skip-layer guidance:

Note: The validation settings are not necessarily the same as the training settings.
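
The validation settings above map onto the keyword arguments of the pipeline call in the usage example (CFG to guidance_scale, Steps to num_inference_steps, and so on). A minimal sketch of keeping them in one place follows. The `ValidationSettings` class and the `to_pipeline_kwargs` helper are hypothetical names introduced here for illustration; they are not part of the model or of diffusers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidationSettings:
    """Hypothetical container mirroring the validation settings listed above."""
    cfg: float = 3.0
    steps: int = 20
    seed: int = 42
    width: int = 1024
    height: int = 1024


def to_pipeline_kwargs(settings: ValidationSettings) -> dict:
    """Translate the settings into the keyword names used by the pipeline call."""
    return {
        "guidance_scale": settings.cfg,
        "num_inference_steps": settings.steps,
        "width": settings.width,
        "height": settings.height,
    }


settings = ValidationSettings()
kwargs = to_pipeline_kwargs(settings)
print(kwargs)
```

The seed is kept out of the dictionary because the usage example passes it through a torch.Generator rather than as a direct argument.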

You can find some example images in the following gallery:

The text encoder was not trained. You may reuse the base model text encoder for inference.

Training settings

Property	Details
Training epochs	0
Training steps	10000
Learning rate	5e-06 (learning rate schedule: polynomial; warmup steps: 100)
Max grad value	2.0
Effective batch size	16 (micro-batch size: 1; gradient accumulation steps: 4; number of GPUs: 4)
Gradient checkpointing	True
Prediction type	flow_matching (extra parameters=['shift=3'])
Optimizer	adamw_bf16
Trainable parameter precision	Pure BF16
Base model precision	`no_change`
Caption dropout probability	0.1%
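
The `shift=3` parameter in the prediction type row is not explained in the original. Assuming it refers to the timestep shift commonly used with SD3-style flow matching (the same formula that diffusers' FlowMatchEulerDiscreteScheduler applies to its sigmas when configured with a static shift), it remaps a noise level sigma in [0, 1] toward the noisier end of the range. The sketch below illustrates that formula only. `shift_sigma` is a hypothetical helper, and whether the trainer applies the shift in exactly this way is an assumption.

```python
def shift_sigma(sigma: float, shift: float = 3.0) -> float:
    """Remap a noise level in [0, 1] with the SD3-style timestep shift.

    Assumed formula: sigma' = shift * sigma / (1 + (shift - 1) * sigma).
    """
    if not 0.0 <= sigma <= 1.0:
        raise ValueError("sigma must lie in [0, 1]")
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


# Evenly spaced noise levels before and after the shift.
for sigma in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"{sigma:.2f} -> {shift_sigma(sigma):.4f}")
```

With a shift of 3, the midpoint 0.5 maps to 0.75, so more of the schedule is spent at high noise levels, while the endpoints 0 and 1 are unchanged. A shift of 1 leaves every value as it is.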

Datasets

Dataset	Repeats	Total number of images	Total number of aspect buckets	Resolution	Cropped	Crop style	Crop aspect	Used for regularisation data
images-512	1	~417748	1	0.262144 megapixels	True	random	square	No
images-768	1	~266740	1	0.589824 megapixels	True	random	square	No
images-1024	1	~246816	1	1.048576 megapixels	True	random	square	No
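
The figures in the training settings and datasets tables can be cross-checked with a little arithmetic. The sketch below assumes that a training step means one optimizer step over the effective batch and that the three datasets are sampled as a single pool; neither is stated in the original, and the image counts are the approximate values from the table. Under those assumptions the run covers less than one full pass over the data, which would be consistent with the reported 0 training epochs.

```python
# Values copied from the training settings and datasets tables.
micro_batch_size = 1
grad_accum_steps = 4
num_gpus = 4
training_steps = 10_000

# Approximate image counts, keyed by the square crop edge length in pixels.
images_per_resolution = {512: 417_748, 768: 266_740, 1024: 246_816}

# Effective batch size = micro-batch size x accumulation steps x GPUs.
effective_batch_size = micro_batch_size * grad_accum_steps * num_gpus

# Assumption: each training step consumes one effective batch.
samples_seen = training_steps * effective_batch_size
total_images = sum(images_per_resolution.values())
epochs_completed = samples_seen / total_images

# Resolution in megapixels for a square crop of the given edge length.
megapixels = {edge: edge * edge / 1_000_000 for edge in images_per_resolution}

print(effective_batch_size)
print(samples_seen, total_images)
print(f"{epochs_completed:.2f} epochs")
print(megapixels)
```

The effective batch size works out to 16 and the megapixel values match the Resolution column, so the two tables are internally consistent. The run sees about 160,000 samples against roughly 931,000 images, or about 0.17 of an epoch.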
