# 🚀 Taiyi-Stable-Diffusion-1B-Chinese-v0.1

The first open-source Chinese Stable Diffusion model, trained on 20 million filtered Chinese image-text pairs, enabling high-quality Chinese text-to-image generation.
## 🚀 Quick Start

You can try our model in the Taiyi-Stable-Diffusion-Chinese demo.
## ✨ Features

- **Open Source**: The first open-source Chinese Stable Diffusion model.
- **Bilingual Support**: Supports both Chinese and English prompts.
- **High-Quality Training Data**: Trained on 20 million filtered Chinese image-text pairs.
## 📦 Installation

No specific installation steps are provided in the original document.
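A minimal environment for the `diffusers`-based examples below can be set up as follows; the exact package set is an assumption, so pin versions as your setup requires:

```shell
# Install the libraries the usage examples rely on (assumed package list)
pip install torch diffusers transformers accelerate
```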
## 💻 Usage Examples

### Basic Usage

```python
from diffusers import StableDiffusionPipeline

# Load the Taiyi Chinese Stable Diffusion pipeline and move it to the GPU
pipe = StableDiffusionPipeline.from_pretrained("IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Chinese-v0.1").to("cuda")

prompt = '飞流直下三千尺,油画'  # "A waterfall plunges three thousand feet, oil painting"
image = pipe(prompt, guidance_scale=7.5).images[0]
image.save("飞流.png")
```
### Advanced Usage

```python
import torch
from diffusers import StableDiffusionPipeline

# Let cuDNN benchmark and cache the fastest convolution algorithms
torch.backends.cudnn.benchmark = True

# Load the pipeline in half precision to reduce memory use and speed up inference
pipe = StableDiffusionPipeline.from_pretrained(
    "IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Chinese-v0.1",
    torch_dtype=torch.float16,
)
pipe.to("cuda")

prompt = '飞流直下三千尺,油画'  # "A waterfall plunges three thousand feet, oil painting"
image = pipe(prompt, guidance_scale=7.5).images[0]
image.save("飞流.png")
```
### Handbook for Taiyi

For more usage information, please refer to the handbook.

### How to Finetune

If you want to finetune the model, please refer to this guide.

### Configure WebUI

For WebUI configuration, please refer to this README.

### DreamBooth

For DreamBooth-related content, please refer to this repository.
## 📚 Documentation

### Model Taxonomy

| Property | Details |
|-----------|-----------------|
| Demand | Special |
| Task | Multimodal |
| Series | Taiyi |
| Model | Stable Diffusion |
| Parameter | 1B |
| Extra | Chinese |
### Model Information

We used the Noah-Wukong dataset (100M) and the Zero dataset (23M) as our pre-training datasets. First, we used IDEA-CCNL/Taiyi-CLIP-RoBERTa-102M-ViT-L-Chinese to score the similarity of the image-text pairs in these two datasets, and kept only the pairs with a CLIP Score greater than 0.2 as our training set.

We then used IDEA-CCNL/Taiyi-CLIP-RoBERTa-102M-ViT-L-Chinese as the initial text encoder, froze the other parts of the stable-diffusion-v1-4 (paper) model, and trained only the text encoder. This preserves the generative ability of the original model while aligning it with Chinese concepts. The model was trained for one epoch on the resulting 20 million image-text pairs, on 32 × A100 GPUs for about 100 hours. This is a preliminary version; we will continuously optimize and open-source subsequent models. Welcome to communicate with us!
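The CLIP-score filtering step described above can be sketched as follows. This is a toy illustration, not the project's actual pipeline: the `image_emb`/`text_emb` arrays stand in for embeddings produced by the Taiyi-CLIP image and text encoders, and CLIP Score is taken to be the cosine similarity between them.

```python
import numpy as np

def clip_score(image_emb, text_emb):
    # CLIP Score modeled as cosine similarity between L2-normalized embeddings
    image_emb = image_emb / np.linalg.norm(image_emb)
    text_emb = text_emb / np.linalg.norm(text_emb)
    return float(np.dot(image_emb, text_emb))

def filter_pairs(pairs, threshold=0.2):
    # Keep only image-text pairs whose CLIP Score exceeds the threshold,
    # mirroring the 0.2 cutoff used to build the 20M-pair training set
    return [p for p in pairs if clip_score(p["image_emb"], p["text_emb"]) > threshold]

# Toy embeddings standing in for real CLIP features
well_aligned = {"image_emb": np.array([1.0, 1.0, 0.0]), "text_emb": np.array([1.0, 0.9, 0.1])}
mismatched = {"image_emb": np.array([1.0, 0.0, 0.0]), "text_emb": np.array([0.0, 1.0, 0.0])}

kept = filter_pairs([well_aligned, mismatched])
print(len(kept))  # 1: only the well-aligned pair passes the 0.2 threshold
```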
### Result

#### Basic Prompt

| Prompt | English gloss |
|--------|---------------|
| 铁马冰河入梦来,3D绘画。 | "Armored horses and frozen rivers enter my dreams", 3D painting |
| 飞流直下三千尺,油画。 | "A waterfall plunges three thousand feet", oil painting |
| 女孩背影,日落,唯美插画。 | A girl seen from behind at sunset, aesthetic illustration |
#### Advanced Prompt

| Prompt | English gloss |
|--------|---------------|
| 铁马冰河入梦来,概念画,科幻,玄幻,3D | "Armored horses and frozen rivers enter my dreams", concept art, sci-fi, fantasy, 3D |
| 中国海边城市,科幻,未来感,唯美,插画。 | A Chinese seaside city, sci-fi, futuristic, aesthetic, illustration |
| 那人却在灯火阑珊处,色彩艳丽,古风,资深插画师作品,桌面高清壁纸。 | "Yet there she stands where the lantern light is dim", vivid colors, ancient Chinese style, senior illustrator's work, HD desktop wallpaper |
## 📄 License

This model is released under the CreativeML OpenRAIL-M license. Please read the full license here.
## 📖 Citation

If you use our model in your work, you can cite our paper:

```bibtex
@article{fengshenbang,
  author  = {Jiaxing Zhang and Ruyi Gan and Junjie Wang and Yuxiang Zhang and Lin Zhang and Ping Yang and Xinyu Gao and Ziwei Wu and Xiaoqun Dong and Junqing He and Jianheng Zhuo and Qi Yang and Yongfeng Huang and Xiayu Li and Yanghan Wu and Junyu Lu and Xinyu Zhu and Weifeng Chen and Ting Han and Kunhao Pan and Rui Wang and Hao Wang and Xiaojun Wu and Zhongshen Zeng and Chongpei Chen},
  title   = {Fengshenbang 1.0: Being the Foundation of Chinese Cognitive Intelligence},
  journal = {CoRR},
  volume  = {abs/2209.02970},
  year    = {2022}
}
```
You can also cite our website:

```bibtex
@misc{Fengshenbang-LM,
  title        = {Fengshenbang-LM},
  author       = {IDEA-CCNL},
  year         = {2021},
  howpublished = {\url{https://github.com/IDEA-CCNL/Fengshenbang-LM}},
}
```