SegMoE-4x2-v0 Open-source Image Generation Model - Combining Experts in SDXL for Even Stronger Image Generation Capabilities

SegMoE 4x2 v0

Developed by Segmind

SegMoE-4x2-v0 is a Segmind Mixture of Diffusion Experts model composed, without any additional training, from 4 expert SDXL models. Combining the experts is intended to give a broader knowledge base and stronger image generation than a single model.

Image Generation Open Source License: Apache-2.0 #Mixture of Diffusion Experts #Training-Free Composition #Hyper-Realistic Generation

Downloads 1,389

Release Time: 1/29/2024

Model Overview

SegMoE is a powerful framework that dynamically combines multiple Stable Diffusion models into a Mixture of Experts within minutes, without requiring training. This framework enables the instant creation of larger models with a broader knowledge base, better prompt adherence, and superior image quality.

Model Features

Dynamic Expert Model Composition

Dynamically combines multiple expert-level SDXL models without training to form a more powerful model.

Broad Knowledge Base

Integrates knowledge from multiple expert models for broader understanding and generation capabilities.

High-Quality Image Generation

Enhances image quality and prompt adherence through the Mixture of Experts.

Training-Free

The model composition process requires no additional training steps.

Model Capabilities

Text-to-Image Generation

Hyper-Realistic Image Generation

Multi-Style Image Generation

Use Cases

Creative Design

Concept Art Creation

Generate concept art images for games, films, etc.

High-quality, diverse concept artworks.

Advertising Design

Create visual materials for advertisements.

Professional-grade advertising images.

Content Creation

Social Media Content

Generate engaging visual content for social media platforms.

Diverse styles of social media images.

Illustration Creation

Create illustrations for books, magazines, etc.

Artistically rich illustrations.

🚀 SegMoE-4x2-v0: Segmind Mixture of Diffusion Experts

SegMoE-4x2-v0 is a Segmind Mixture of Diffusion Experts model generated with segmoe from 4 expert SDXL models, with no additional training. SegMoE is a framework for dynamically combining Stable Diffusion models into a Mixture of Experts within minutes, without training. It allows larger models to be created on the fly, offering broader knowledge, better prompt adherence, and better image quality.
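The "4x2" in the model name corresponds to the `num_experts: 4` and `num_experts_per_tok: 2` fields in the config further down. As a rough illustration of what routing each token to 2 of 4 experts means, here is a minimal, generic sketch of top-k sparse routing in plain Python. It assumes a softmax over the selected gate scores and toy experts; it shows the general Mixture of Experts idea and is not taken from the segmoe source. The names `route_top_k` and `mix_experts` are hypothetical.

```python
import math


def route_top_k(gate_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    # Indices of the k largest logits, highest first (ties keep original order).
    top = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)[:k]
    # Softmax restricted to the selected experts, shifted by the max for stability.
    peak = max(gate_logits[i] for i in top)
    exps = [math.exp(gate_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def mix_experts(token, experts, gate_logits, k=2):
    """Weighted sum of the outputs of the k selected experts for one token."""
    routed = route_top_k(gate_logits, k)
    outputs = [(weight, experts[i](token)) for i, weight in routed]
    dim = len(outputs[0][1])
    return [sum(weight * out[d] for weight, out in outputs) for d in range(dim)]


# Four toy "experts": each scales the token vector by a different factor.
experts = [lambda x, s=s: [s * v for v in x] for s in (1.0, 2.0, 3.0, 4.0)]
gate_logits = [0.1, 2.0, -1.0, 2.0]  # made-up gate scores for one token

print(route_top_k(gate_logits, k=2))                        # experts 1 and 3, weight 0.5 each
print(mix_experts([1.0, -1.0], experts, gate_logits, k=2))  # [3.0, -3.0]
```

Only the selected experts run for a given token, which is why a model with 4 experts does not cost 4 times the compute of a single expert per token in this scheme.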

🚀 Quick Start

This model can be used via the segmoe library.

📦 Installation

Install segmoe by running:

pip install segmoe

💻 Usage Examples

Basic Usage

from segmoe import SegMoEPipeline

# Load the SegMoE-4x2-v0 model onto the GPU.
pipeline = SegMoEPipeline("segmind/SegMoE-4x2-v0", device="cuda")

prompt = "cosmic canvas, orange city background, painting of a chubby cat"
negative_prompt = "nsfw, bad quality, worse quality"
# Generate one 1024x1024 image and keep the first result.
img = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    height=1024,
    width=1024,
    num_inference_steps=25,
    guidance_scale=7.5,
).images[0]
img.save("image.png")
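To try several prompts in one run, the call above can be wrapped in a small loop. The following is a minimal sketch that relies only on the call shape shown in the basic example (keyword arguments in, `.images[0]` out, `save` on the image). The helpers `slugify` and `generate_batch` are hypothetical names introduced here, not part of segmoe.

```python
import re
from pathlib import Path


def slugify(prompt, max_len=40):
    """Turn a prompt into a short, filesystem-safe file stem."""
    slug = re.sub(r"[^a-z0-9]+", "-", prompt.lower()).strip("-")
    return slug[:max_len].rstrip("-") or "image"


def generate_batch(pipeline, prompts, out_dir, **gen_kwargs):
    """Run the pipeline once per prompt and save each first image as a PNG."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, prompt in enumerate(prompts):
        # Same call shape as the basic example: keyword arguments, first image.
        image = pipeline(prompt=prompt, **gen_kwargs).images[0]
        path = out_dir / f"{index:02d}-{slugify(prompt)}.png"
        image.save(path)
        paths.append(path)
    return paths
```

With the `pipeline` object from the basic example, a call such as `generate_batch(pipeline, ["three green glass bottles", "panda bear with aviator glasses on its head"], "outputs", negative_prompt=negative_prompt, height=1024, width=1024, num_inference_steps=25, guidance_scale=7.5)` would write one numbered PNG per prompt into `outputs/`.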

Config

The config used to create this model is:

base_model: SG161222/RealVisXL_V3.0
num_experts: 4
moe_layers: all
num_experts_per_tok: 2
experts:
  - source_model: frankjoshua/juggernautXL_v8Rundiffusion
    positive_prompt: "aesthetic, cinematic, hands, portrait, photo, illustration, 8K, hyperdetailed, origami, man, woman, supercar"
    negative_prompt: "(worst quality, low quality, normal quality, lowres, low details, oversaturated, undersaturated, overexposed, underexposed, grayscale, bw, bad photo, bad photography, bad art:1.4), (watermark, signature, text font, username, error, logo, words, letters, digits, autograph, trademark, name:1.2), (blur, blurry, grainy), morbid, ugly, asymmetrical, mutated malformed, mutilated, poorly lit, bad shadow, draft, cropped, out of frame, cut off, censored, jpeg artifacts, out of focus, glitch, duplicate, (airbrushed, cartoon, anime, semi-realistic, cgi, render, blender, digital art, manga, amateur:1.3), (3D ,3D Game, 3D Game Scene, 3D Character:1.1), (bad hands, bad anatomy, bad body, bad face, bad teeth, bad arms, bad legs, deformities:1.3)"
  - source_model: SG161222/RealVisXL_V3.0
    positive_prompt: "cinematic, portrait, photograph, instagram, fashion, movie, macro shot, 8K, RAW, hyperrealistic, ultra realistic,"
    negative_prompt: "(octane render, render, drawing, anime, bad photo, bad photography:1.3), (worst quality, low quality, blurry:1.2), (bad teeth, deformed teeth, deformed lips), (bad anatomy, bad proportions:1.1), (deformed iris, deformed pupils), (deformed eyes, bad eyes), (deformed face, ugly face, bad face), (deformed hands, bad hands, fused fingers), morbid, mutilated, mutation, disfigured"
  - source_model: albertushka/albertushka_DynaVisionXL
    positive_prompt: "minimalist, illustration, award winning art, painting, impressionist, comic, colors, sketch, pencil drawing,"
    negative_prompt: "Compression artifacts, bad art, worst quality, low quality, plastic, fake, bad limbs, conjoined, featureless, bad features, incorrect objects, watermark, ((signature):1.25), logo"
  - source_model: frankjoshua/albedobaseXL_v13
    positive_prompt: "photograph f/1.4, ISO 200, 1/160s, 8K, RAW, unedited, symmetrical balance, in-frame, 8K"
    negative_prompt: "nsfw, lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, blurry"
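Several fields in this config depend on one another: `num_experts` should agree with the number of entries under `experts`, and `num_experts_per_tok` cannot sensibly exceed it. A minimal sketch of such sanity checks follows, assuming the YAML has already been loaded into a Python dict. The function `check_moe_config` and the rules it enforces are illustrative assumptions, not checks performed by segmoe itself.

```python
def check_moe_config(cfg):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    experts = cfg.get("experts") or []

    # Assumed rule: the declared expert count matches the listed experts.
    if cfg.get("num_experts") != len(experts):
        problems.append(
            f"num_experts is {cfg.get('num_experts')} but {len(experts)} experts are listed"
        )

    # Assumed rule: each token is routed to at least 1 and at most all experts.
    k = cfg.get("num_experts_per_tok")
    if not isinstance(k, int) or not 1 <= k <= len(experts):
        problems.append(f"num_experts_per_tok={k!r} is outside 1..{len(experts)}")

    # Assumed rule: every expert names a model and carries both prompts.
    for index, expert in enumerate(experts):
        for key in ("source_model", "positive_prompt", "negative_prompt"):
            if not expert.get(key):
                problems.append(f"expert {index} is missing {key}")
    return problems


# Same shape as the config above; prompts are shortened for readability.
config = {
    "base_model": "SG161222/RealVisXL_V3.0",
    "num_experts": 4,
    "moe_layers": "all",
    "num_experts_per_tok": 2,
    "experts": [
        {"source_model": "frankjoshua/juggernautXL_v8Rundiffusion",
         "positive_prompt": "aesthetic, cinematic", "negative_prompt": "worst quality"},
        {"source_model": "SG161222/RealVisXL_V3.0",
         "positive_prompt": "cinematic, portrait", "negative_prompt": "octane render"},
        {"source_model": "albertushka/albertushka_DynaVisionXL",
         "positive_prompt": "minimalist, illustration", "negative_prompt": "bad art"},
        {"source_model": "frankjoshua/albedobaseXL_v13",
         "positive_prompt": "photograph f/1.4", "negative_prompt": "nsfw, lowres"},
    ],
}

print(check_moe_config(config))  # prints []
```

Running a check like this before composing a model can catch a mismatched expert count early, which is cheaper than discovering it after the expert checkpoints have been downloaded.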

Other Variants

We release 3 merges on Hugging Face. Besides this model (SegMoE 4x2, four SDXL expert models), they are:

SegMoE 2x1 has two expert models.
SegMoE SD 4x2 has four Stable Diffusion 1.5 expert models.

Comparison

Prompt understanding appears to improve, as shown in the images below. From left to right: SegMoE-2x1-v0, SegMoE-4x2-v0, base model (RealVisXL_V3.0).

Prompt: three green glass bottles

Prompt: panda bear with aviator glasses on its head

Prompt: the statue of Liberty next to the Washington Monument

Model Description

Property	Details
Developed by	Segmind
Developers	Yatharth Gupta and Vishnu Jaddipal
Model Type	Diffusion-based text-to-image generative mixture of experts model
License	Apache 2.0

Out-of-Scope Use

⚠️ Important Note

The SegMoE-4x2-v0 Model is not suitable for creating factual or accurate representations of people, events, or real-world information. It is not intended for tasks requiring high precision and accuracy.

✨ Features

Benefits from the knowledge of several fine-tuned experts
Training-free
Better adaptability to data
The model can be upgraded by using a better fine-tuned model as one of the experts.

🔧 Technical Details

Limitations

Although the model improves image fidelity and prompt adherence, without training it is not drastically better than any single expert, and it relies on the knowledge of the experts.
The framework is not yet optimized for speed.
The framework is not yet optimized for memory usage.

📄 License

The SegMoE-4x2-v0 model is released under the Apache 2.0 license.

📚 Citation

@misc{segmoe,
  author = {Yatharth Gupta and Vishnu V Jaddipal and Harish Prabhala},
  title = {SegMoE},
  year = {2024},
  publisher = {HuggingFace},
  journal = {HuggingFace Models},
  howpublished = {\url{https://huggingface.co/segmind/SegMoE-4x2-v0}}
}
