🚀 Taiyi-Stable-Diffusion-1B-Chinese-EN-v0.1
The first open-source Chinese-English bilingual Stable Diffusion model, trained on 20 million filtered Chinese image-text pairs, offering high-quality text-to-image generation.
🚀 Quick Start
We provide a Gradio Web UI for running Taiyi-Stable-Diffusion-1B-Chinese-EN-v0.1.

✨ Features
- Bilingual Support: The first open-source Chinese-English bilingual Stable Diffusion model, enabling text-to-image generation from both Chinese and English prompts.
- High-Quality Training: Trained on 20 million filtered Chinese image-text pairs, ensuring high-quality generation results.
📦 Installation
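A minimal environment sketch for the usage examples below, assuming a CUDA-enabled PyTorch build (the authors do not pin exact package versions):

```bash
# Hugging Face Diffusers stack; accelerate is optional but speeds up loading.
pip install torch diffusers transformers accelerate
```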
💻 Usage Examples
Basic Usage
```python
from diffusers import StableDiffusionPipeline

# Load the bilingual Taiyi pipeline and move it to the GPU.
pipe = StableDiffusionPipeline.from_pretrained("IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Chinese-EN-v0.1").to("cuda")

# Chinese and English can be mixed in a single prompt:
# "A small bridge over a flowing stream and houses, Van Gogh style".
prompt = '小桥流水人家,Van Gogh style'
image = pipe(prompt, guidance_scale=10).images[0]
image.save("小桥.png")
```
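Because the model is bilingual, the same pipeline also accepts English-only prompts. A minimal variation of the example above (the prompt text and output filename are illustrative):

```python
# English-only prompt, reusing the pipeline loaded above.
prompt = "a small bridge over a flowing stream and houses, Van Gogh style"
image = pipe(prompt, guidance_scale=10).images[0]
image.save("bridge_en.png")
```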
Advanced Usage
Adding `torch_dtype=torch.float16` and `device_map="auto"` can quickly load FP16 weights to speed up inference. For more information, see the optimization docs.
```python
import torch
from diffusers import StableDiffusionPipeline

# Let cuDNN benchmark convolution algorithms for fixed input sizes.
torch.backends.cudnn.benchmark = True

# Load the weights in half precision to reduce memory use and speed up inference.
pipe = StableDiffusionPipeline.from_pretrained(
    "IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Chinese-EN-v0.1",
    torch_dtype=torch.float16,
)
pipe.to("cuda")

prompt = '小桥流水人家,Van Gogh style'
image = pipe(prompt, guidance_scale=10.0).images[0]
image.save("小桥.png")
```
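If GPU memory is still tight, Diffusers also offers attention slicing, which lowers peak memory at a small speed cost. A minimal sketch, reusing the FP16 pipeline above (availability depends on your diffusers version):

```python
# Compute attention in slices to reduce peak VRAM usage.
pipe.enable_attention_slicing()
image = pipe(prompt, guidance_scale=10.0).images[0]
```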
📚 Documentation
Model Taxonomy
| Property | Details |
|---|---|
| Demand | Special |
| Task | Multimodal |
| Series | Taiyi |
| Model | Stable Diffusion |
| Parameter | 1B |
| Extra | Chinese and English |
Model Information
We use Noah-Wukong (100M) and Zero (23M) as our datasets, and keep the image-text pairs whose CLIP Score (computed with IDEA-CCNL/Taiyi-CLIP-RoBERTa-102M-ViT-L-Chinese) is greater than 0.2 as our training set, as sketched below. We finetune the stable-diffusion-v1-4 (paper) model in two stages.
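A minimal sketch of the CLIP-score filtering idea. The authors scored pairs with IDEA-CCNL/Taiyi-CLIP-RoBERTa-102M-ViT-L-Chinese; the generic `transformers` CLIP API and the `openai/clip-vit-large-patch14` checkpoint below are stand-ins for illustration only:

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Stand-in checkpoint; the authors used their Chinese CLIP model instead.
model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14").eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")

def clip_score(image: Image.Image, caption: str) -> float:
    """Cosine similarity between the image and caption embeddings."""
    inputs = processor(text=[caption], images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**inputs)
    img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
    txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
    return (img * txt).sum().item()

# Keep a pair only if it clears the 0.2 threshold:
# if clip_score(img, caption) > 0.2: training_set.append((img, caption))
```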
Stage 1: To preserve the powerful generative capability of Stable Diffusion and align Chinese concepts with the images, we train only the text encoder and freeze the other parts of the model in the first stage; a sketch of this freezing scheme follows below.
Stage 2: We unfreeze both the text encoder and the diffusion model, so that the diffusion model gains better compatibility with Chinese-language guidance.
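A minimal sketch of the stage-1 freezing scheme, using the component names that `diffusers` exposes on a loaded pipeline (`text_encoder`, `unet`, `vae`); this illustrates the idea and is not the authors' training code:

```python
# Stage 1: only the text encoder learns; the UNet and VAE stay frozen,
# preserving the original generative capability of Stable Diffusion.
for param in pipe.unet.parameters():
    param.requires_grad = False
for param in pipe.vae.parameters():
    param.requires_grad = False
for param in pipe.text_encoder.parameters():
    param.requires_grad = True

# Stage 2 simply unfreezes the UNet as well:
# for param in pipe.unet.parameters():
#     param.requires_grad = True
```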
The first stage took 80 hours of training and the second stage took 100 hours, both on 8 × A100 GPUs. This model is a preliminary version; we will continue to update and open-source improved versions. Feedback and exchange are welcome!
Result
- 小桥流水人家,Van Gogh style. (A small bridge over a flowing stream and houses, Van Gogh style.)
- 小桥流水人家,水彩。(A small bridge over a flowing stream and houses, watercolor.)
- 吃过桥米线的猫。(A cat that has eaten crossing-the-bridge rice noodles.)
- 穿着宇航服的哈士奇。(A husky wearing a spacesuit.)
Usage - How to finetune
You can refer to https://github.com/IDEA-CCNL/Fengshenbang-LM/tree/main/fengshen/examples/finetune_taiyi_stable_diffusion
Usage - Configure webui
You can refer to https://github.com/IDEA-CCNL/stable-diffusion-webui/blob/master/README.md
Usage - DreamBooth
You can refer to https://github.com/IDEA-CCNL/Fengshenbang-LM/tree/main/fengshen/examples/stable_diffusion_dreambooth
📄 License
This model is under the CreativeML OpenRAIL-M license.
This model is open access and available to all, with a CreativeML OpenRAIL-M license further specifying rights and usage.
The CreativeML OpenRAIL License specifies:
- You can't use the model to deliberately produce or share illegal or harmful outputs or content.
- IDEA-CCNL claims no rights on the outputs you generate; you are free to use them and are accountable for their use, which must not go against the provisions set in the license.
- You may redistribute the weights and use the model commercially and/or as a service. If you do, please be aware that you have to include the same use restrictions as those in the license and share a copy of the CreativeML OpenRAIL-M license with all your users (please read the license entirely and carefully).
Please read the full license here: https://huggingface.co/spaces/CompVis/stable-diffusion-license
📖 Citation
If you are using this resource for your work, please cite our paper:
```bibtex
@article{fengshenbang,
  author  = {Jiaxing Zhang and Ruyi Gan and Junjie Wang and Yuxiang Zhang and Lin Zhang and Ping Yang and Xinyu Gao and Ziwei Wu and Xiaoqun Dong and Junqing He and Jianheng Zhuo and Qi Yang and Yongfeng Huang and Xiayu Li and Yanghan Wu and Junyu Lu and Xinyu Zhu and Weifeng Chen and Ting Han and Kunhao Pan and Rui Wang and Hao Wang and Xiaojun Wu and Zhongshen Zeng and Chongpei Chen},
  title   = {Fengshenbang 1.0: Being the Foundation of Chinese Cognitive Intelligence},
  journal = {CoRR},
  volume  = {abs/2209.02970},
  year    = {2022}
}
```
You can also cite our website:
```bibtex
@misc{Fengshenbang-LM,
  title        = {Fengshenbang-LM},
  author       = {IDEA-CCNL},
  year         = {2021},
  howpublished = {\url{https://github.com/IDEA-CCNL/Fengshenbang-LM}},
}
```