🚀 BRIA 3.1 - Text-to-Image Model
BRIA 3.1 is a new text-to-image model that generates high-quality images. It's trained on fully licensed data, offering both API access and direct access to model weights for seamless integration. With 4 billion parameters, it's lightweight yet delivers high visual fidelity and strong prompt alignment.
🚀 Quick Start
The BRIA 3.1 model can be accessed in multiple ways. You can use the API endpoint, integrate it into ComfyUI workflows, or purchase the model weights for direct use.
✨ Features
- Improved Aesthetics: Generate highly appealing images in photorealistic, illustrative, and graphic styles.
- High Prompt Alignment: Ensure precise adherence to user-provided textual descriptions for accurate outputs.
- Legally Compliant: Provide full legal liability coverage for copyright and privacy infringements, thanks to 100% licensed training data.
- Attribution Engine: Use a proprietary patented attribution engine to compensate data partners based on generated images.
- Customizable Technology: Gain access to source code and weights for extensive customization.
📦 Installation
To use the BRIA 3.1 model, first install the required libraries:

```bash
pip install torch diffusers transformers huggingface_hub
```
💻 Usage Examples
Basic Usage
```python
import os
import torch
from huggingface_hub import hf_hub_download

# Download the custom pipeline files from the model repository.
try:
    local_dir = os.path.dirname(__file__)
except NameError:  # __file__ is not defined in interactive sessions (e.g. notebooks)
    local_dir = '.'

hf_hub_download(repo_id="briaai/BRIA-3.1", filename='pipeline_bria.py', local_dir=local_dir)
hf_hub_download(repo_id="briaai/BRIA-3.1", filename='transformer_bria.py', local_dir=local_dir)
hf_hub_download(repo_id="briaai/BRIA-3.1", filename='bria_utils.py', local_dir=local_dir)

from pipeline_bria import BriaPipeline

# Load the pipeline in bfloat16 and move it to the GPU.
pipe = BriaPipeline.from_pretrained("briaai/BRIA-3.1", torch_dtype=torch.bfloat16, trust_remote_code=True)
pipe.to(device="cuda")

prompt = "A portrait of a Beautiful and playful ethereal singer, golden designs, highly detailed, blurry background"
negative_prompt = "Logo,Watermark,Ugly,Morbid,Extra fingers,Poorly drawn hands,Mutation,Blurry,Extra limbs,Gross proportions,Missing arms,Mutated hands,Long neck,Duplicate,Mutilated,Mutilated hands,Poorly drawn face,Deformed,Bad anatomy,Cloned face,Malformed limbs,Missing legs,Too many fingers"

# Generate a single 1024x1024 image.
image = pipe(prompt=prompt, negative_prompt=negative_prompt, height=1024, width=1024).images[0]
```
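The result is a standard PIL image. As a small follow-up, assuming the `generator` argument behaves as in standard diffusers pipelines (an assumption, not something confirmed by this card), you can save the output and fix a seed for reproducibility:

```python
# Save the generated PIL image to disk (the filename is arbitrary).
image.save("bria_output.png")

# Optional: seed the generation for reproducible results.
# The `generator` argument is standard in diffusers pipelines and is assumed
# (not confirmed here) to be supported by BriaPipeline as well.
generator = torch.Generator(device="cuda").manual_seed(42)
image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    height=1024,
    width=1024,
    generator=generator,
).images[0]
```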
📚 Documentation
Tips for Inference
⚠️ Important Note
Here are some tips for using our text-to-image model at inference (a short sketch applying them follows this list):
- Using a negative prompt is recommended.
- For fine-tuning, use zeros instead of the null text embedding.
- Multiple aspect ratios are supported, but the overall resolution should total approximately 1024*1024 = 1M pixels, for example: (1024, 1024), (1280, 768), (1344, 768), (832, 1216), (1152, 832), (1216, 832), (960, 1088).
- Use 30-50 inference steps (higher is better).
- Use a `guidance_scale` of 5.0.
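A minimal sketch applying these recommendations, assuming the pipeline from the Basic Usage example is already loaded and that `num_inference_steps` is accepted as in standard diffusers pipelines (the prompt and resolution below are illustrative only):

```python
# Assumes `pipe` is the BriaPipeline loaded in the Basic Usage example above.
prompt = "an isometric illustration of a cozy coffee shop, soft morning light"  # illustrative prompt
negative_prompt = "Logo,Watermark,Blurry,Deformed,Bad anatomy"  # a negative prompt is recommended

image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    height=768,                # non-square aspect ratio totaling ~1M pixels (1344*768)
    width=1344,
    num_inference_steps=40,    # 30-50 steps; higher tends to look better
    guidance_scale=5.0,        # recommended guidance scale
).images[0]
image.save("coffee_shop.png")
```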
Training Data and Attribution
BRIA 3.1 was trained on 100% licensed data from leading data partners. The dataset excludes copyrighted materials such as fictional characters, logos, trademarks, and public figures, as well as harmful or privacy-infringing content. A patented Attribution Engine is used to fairly compensate data partners based on the generated images, ensuring legal compliance.
🔧 Technical Details
These advancements were made possible through several key technical upgrades:
First, we augmented our large dataset with synthetic captions generated by cutting-edge vision-language models. Then, we improved our architecture by integrating state-of-the-art transformers, specifically MMDiT and DiT layers, trained with a rectified flow objective. This approach is similar to other open models such as AuraFlow, Flux, and SD3. BRIA 3.1 also employs 2D RoPE for positional embeddings, KQ normalization for enhanced training stability, and noise shifting for high-resolution training.
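For reference, the rectified flow objective mentioned above follows the standard flow-matching formulation (this is the generic objective, not a BRIA-specific detail): data and noise are linearly interpolated, and the network is trained to predict the constant velocity between them.

$$
x_t = (1 - t)\,x_0 + t\,\epsilon,
\qquad
\mathcal{L} = \mathbb{E}_{x_0,\,\epsilon,\,t}\left[\left\| v_\theta(x_t, t) - (\epsilon - x_0) \right\|^2\right]
$$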
To ensure affordable inference and fine-tuning, BRIA 3.1 is designed to be compact, consisting of 28 MMDiT layers and 8 DiT layers, totaling 4 billion parameters. We exclusively use the T5 text encoder, avoiding CLIP to minimize unwanted biases. For spatial compression, we employ an open-source f8 VAE, after confirming that the VAE does not introduce bias into the model.
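Once the pipeline is loaded, these components can be inspected directly. The attribute names below (`transformer`, `text_encoder`, `vae`) follow standard diffusers conventions and are assumptions about this custom pipeline rather than details confirmed by the card:

```python
# Assumes `pipe` is the BriaPipeline loaded in the Basic Usage example.
# Attribute names follow diffusers conventions (assumed, not confirmed here).
num_params = sum(p.numel() for p in pipe.transformer.parameters())
print(f"Transformer parameters: {num_params / 1e9:.2f}B")  # expected: ~4B
print(type(pipe.text_encoder).__name__)   # expected: a T5-based text encoder (no CLIP)
print(pipe.vae.config)                    # expected: an f8 (8x spatial compression) VAE
```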
Our base model is not distilled and natively supports classifier-free guidance, offering full flexibility for fine-tuning.
Additionally, BRIA 3.1 is trained on multiple aspect ratios and resolutions, allowing it to natively produce 1-megapixel images both horizontally and vertically.
Finally, we also provide full support for diffusers code libraries and ComfyUI, enabling fast experimentation and deployment.
Fine-tuning code will be provided soon.
📄 License
The model is released under the bria-t2i license. Model weights from BRIA AI can be obtained with the purchase of a commercial license. Fill in the form below and we will reach out to you. Need API access? Get it here (1K monthly free API calls). A startup or a student? Get access by applying to our Startup Program.
| Property | Details |
| --- | --- |
| Model Type | text-to-image |
| Training Data | 100% licensed data from leading data partners, excluding copyrighted materials |