dart-v2-moe-sft: Open-Source Tag Generation Model for Danbooru-Style Image Tags

Dart V2 Moe Sft

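Developed by p1atdev

Dart v2 is a fine-tuned Danbooru tag generation model based on the Mixtral architecture, built specifically to generate Danbooru-style image tags.

Large Language Model

Transformers

License: Apache-2.0 #DanbooruTagGeneration #AnimeImageAnnotation #MixtralArchitectureOptimization

Downloads: 5,575

Released: 5/6/2024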

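Model Overview

Given an input prompt, the model generates Danbooru-style image tags. It supports multiple rating, aspect-ratio, and length settings and is suited to image annotation and tag generation tasks.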

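Model Features

Multi-parameter control

Supports controlling the rating, aspect ratio, length, and degree of identity preservation of the generated tags

Mixtral architecture

Built on the efficient Mixtral architecture for high-quality tag generation

Multiple variants

Offers model variants with different architectures and sizes to suit different needs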

Model Capabilities

Danbooru tag generation

Automatic image tag generation

Multi-parameter tag control

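Use Cases

Image annotation

Anime image tag generation

Generates detailed Danbooru tags for anime-style images

Produces tags with detailed descriptions of characters, clothing, expressions, and more

Content creation assistance

AI art prompt generation

Generates detailed prompt tags for AI art tools

Provides structured, detailed art prompts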

🚀 Dart (Danbooru Tags Transformer) v2

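Dart (Danbooru Tags Transformer) v2 is a fine-tuned model for generating Danbooru tags. Given copyright, character, and general tags together with rating, aspect-ratio, length, and identity options, it generates additional matching tags.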

Demo: 🤗 Space with ZERO

✨ Key Features

Model variants

Name	Architecture	Parameters	Type
v2-moe-sft	Mixtral	166M	Supervised fine-tuning (SFT)
v2-moe-base	Mixtral	166M	Pretrained
v2-sft	Mistral	114M	Supervised fine-tuning (SFT)
v2-base	Mistral	114M	Pretrained
v2-vectors	Embedding	-	Tag embedding

📦 Installation

Using the 📦`dartrs` library

pip install -U dartrs

💻 Usage Examples

Basic usage

Using 🤗Transformers

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_NAME = "p1atdev/dart-v2-moe-sft"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.bfloat16)

# Build the prompt: conditioning tags first, then the control tokens
prompt = (
    "<|bos|>"
    "<copyright>vocaloid</copyright>"
    "<character>hatsune miku</character>"
    "<|rating:general|><|aspect_ratio:tall|><|length:long|>"
    "<general>1girl, cat ears<|identity:none|><|input_end|>"
)
inputs = tokenizer(prompt, return_tensors="pt").input_ids

# Sample up to 128 new tokens without tracking gradients
with torch.no_grad():
    outputs = model.generate(
        inputs,
        do_sample=True,
        temperature=1.0,
        top_p=1.0,
        top_k=100,
        max_new_tokens=128,
        num_beams=1,
    )

# Decode token by token, drop special tokens and empty strings, then join the tags
print(", ".join([tag for tag in tokenizer.batch_decode(outputs[0], skip_special_tokens=True) if tag.strip() != ""]))
# vocaloid, hatsune miku, 1girl, cat ears, closed mouth, detached sleeves, dress, expressionless, from behind, full body, green theme, hair ornament, hair ribbon, headphones, high heels, holding, holding microphone, long hair, microphone, monochrome, necktie, ribbon, short dress, shoulder tattoo, simple background, sleeveless, sleeveless dress, spot color, standing, tattoo, thighhighs, twintails, very long hair, white background
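
The sample output above is a single comma-separated string that begins with the conditioning tags (`vocaloid`, `hatsune miku`, `1girl`, `cat ears`). As a minimal sketch, the helpers below split such a string into a list and drop the tags that were supplied as input. The names `split_tags` and `newly_generated` are hypothetical; they are not part of 🤗Transformers or the model repository.

```python
def split_tags(text: str) -> list[str]:
    """Split a comma-separated tag string into stripped, non-empty tags."""
    return [tag.strip() for tag in text.split(",") if tag.strip()]


def newly_generated(output: str, given: list[str]) -> list[str]:
    """Return the tags in `output` that are not in `given` (hypothetical helper)."""
    given_set = {tag.strip().lower() for tag in given}
    return [tag for tag in split_tags(output) if tag.lower() not in given_set]


if __name__ == "__main__":
    # Shortened copy of the sample output shown above
    sample = "vocaloid, hatsune miku, 1girl, cat ears, closed mouth, detached sleeves"
    print(newly_generated(sample, ["vocaloid", "hatsune miku", "1girl", "cat ears"]))
```

Comparing in lowercase is a design choice made here for robustness; Danbooru tags are normally lowercase already.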

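Advanced usage

Using the 📦`dartrs` library

⚠️ Important

This library is still experimental and may undergo breaking changes in the future.

📦dartrs is an inference library for Dart v2 models built on the 🤗candle backend.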

from dartrs.dartrs import DartTokenizer
from dartrs.utils import get_generation_config
from dartrs.v2 import (
    compose_prompt,
    MixtralModel,
    V2Model,
)
import time

MODEL_NAME = "p1atdev/dart-v2-moe-sft"

model = MixtralModel.from_pretrained(MODEL_NAME)
tokenizer = DartTokenizer.from_pretrained(MODEL_NAME)

config = get_generation_config(
    prompt=compose_prompt(
        copyright="vocaloid",
        character="hatsune miku",
        rating="general", # sfw, general, sensitive, nsfw, questionable, explicit
        aspect_ratio="tall", # ultra_wide, wide, square, tall, ultra_tall
        length="medium", # very_short, short, medium, long, very_long
        identity="none", # none, lax, strict
        prompt="1girl, cat ears",
    ),
    tokenizer=tokenizer,
)

start = time.time()
output = model.generate(config)
end = time.time()

print(output)
print(f"Time taken: {end - start:.2f}s")
# cowboy shot, detached sleeves, empty eyes, green eyes, green hair, green necktie, hair in own mouth, hair ornament, letterboxed, light frown, long hair, long sleeves, looking to the side, necktie, parted lips, shirt, sleeveless, sleeveless shirt, twintails, wing collar
# Time taken: 0.26s
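
The sample output above lists only newly generated tags; the copyright, character, and general tags passed to `compose_prompt` do not appear in it. Assuming that holds for your run, a full prompt for an image generator can be assembled by joining the pieces in order and dropping duplicates. `merge_tags` below is a hypothetical helper, not part of `dartrs`.

```python
def merge_tags(*parts: str) -> str:
    """Join comma-separated tag strings in order, skipping empty and duplicate tags."""
    seen = set()
    merged = []
    for part in parts:
        for tag in part.split(","):
            tag = tag.strip()
            if tag and tag not in seen:
                seen.add(tag)
                merged.append(tag)
    return ", ".join(merged)


if __name__ == "__main__":
    generated = "cowboy shot, detached sleeves, twintails"  # shortened sample output
    print(merge_tags("vocaloid", "hatsune miku", "1girl, cat ears", generated))
```

Keeping the first occurrence of each tag preserves the order copyright, character, input tags, generated tags.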

📚 Documentation

Prompt format

prompt = (
    f"<|bos|>"
    f"<copyright>{copyright_tags_here}</copyright>"
    f"<character>{character_tags_here}</character>"
    f"<|rating:general|><|aspect_ratio:tall|><|length:long|>"
    f"<general>{general_tags_here}<|identity:none|><|input_end|>"
)
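
To avoid typos in the special tokens, the template above can be wrapped in a small function that validates each option before formatting. This is only a sketch: `build_prompt` and the `ALLOWED` table are hypothetical names, the option values are copied from the tag lists in this section, and `nsfw` is assumed to use the same `<|rating:...|>` form as the other ratings.

```python
# Allowed option values, copied from the tag lists in this section.
# Assumption: "nsfw" is written as <|rating:nsfw|>, like the other ratings.
ALLOWED = {
    "rating": ("sfw", "general", "sensitive", "nsfw", "questionable", "explicit"),
    "aspect_ratio": ("ultra_wide", "wide", "square", "tall", "ultra_tall"),
    "length": ("very_short", "short", "medium", "long", "very_long"),
    "identity": ("none", "lax", "strict"),
}


def build_prompt(copyright_tags="", character_tags="", general_tags="",
                 rating="general", aspect_ratio="tall", length="long",
                 identity="none"):
    """Format a Dart v2 prompt string after validating the option values."""
    options = {"rating": rating, "aspect_ratio": aspect_ratio,
               "length": length, "identity": identity}
    for name, value in options.items():
        if value not in ALLOWED[name]:
            raise ValueError(f"{name} must be one of {ALLOWED[name]}, got {value!r}")
    return (
        "<|bos|>"
        f"<copyright>{copyright_tags}</copyright>"
        f"<character>{character_tags}</character>"
        f"<|rating:{rating}|><|aspect_ratio:{aspect_ratio}|><|length:{length}|>"
        f"<general>{general_tags}<|identity:{identity}|><|input_end|>"
    )


if __name__ == "__main__":
    print(build_prompt("vocaloid", "hatsune miku", "1girl, cat ears"))
```

With the default options, this reproduces the prompt string used in the 🤗Transformers example.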

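Rating tags: <|rating:sfw|>, <|rating:general|>, <|rating:sensitive|>, <|rating:nsfw|>, <|rating:questionable|>, <|rating:explicit|>
- sfw: randomly generates tags in the general or sensitive rating category.
- general: generates tags in the general rating category.
- sensitive: generates tags in the sensitive rating category.
- nsfw: randomly generates tags in the questionable or explicit rating category.
- questionable: generates tags in the questionable rating category.
- explicit: generates tags in the explicit rating category.
Aspect ratio tags: <|aspect_ratio:ultra_wide|>, <|aspect_ratio:wide|>, <|aspect_ratio:square|>, <|aspect_ratio:tall|>, <|aspect_ratio:ultra_tall|>
- ultra_wide: generates tags suited to extremely wide images (2:1 or wider).
- wide: generates tags suited to wide images (2:1 to 9:8).
- square: generates tags suited to square images (9:8 to 8:9).
- tall: generates tags suited to tall images (8:9 to 1:2).
- ultra_tall: generates tags suited to extremely tall images (1:2 or taller).
Length tags: <|length:very_short|>, <|length:short|>, <|length:medium|>, <|length:long|>, <|length:very_long|>
- very_short: generates about 10 tags in total.
- short: generates about 20 tags in total.
- medium: generates about 30 tags in total.
- long: generates about 40 tags in total.
- very_long: generates more than 40 tags in total.
Identity tags: <|identity:none|>, <|identity:lax|>, <|identity:strict|>
- This tag sets how strictly the identity of the character or subject in the provided tags is preserved during generation.
- none: recommended when very few general tags are specified. It generates tags very creatively but sometimes ignores the conditions set by the general tags.
- lax: recommended if you want to preserve the identity of the character or subject in the general tags. It tries to avoid generating tags that conflict with the input general tags.
- strict: recommended if you strongly want to preserve the identity of the character or subject in the general tags. It avoids conflicting tags more strictly than lax, at the cost of creativity. If you do not like the results of strict, try lax or none.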
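
The aspect-ratio ranges listed above can be applied to a target image size to pick the matching tag. The sketch below shows one way to do it. `aspect_ratio_tag` is a hypothetical helper, and the assignment of exact boundary values (2:1, 9:8, 8:9, 1:2) is an assumption, since adjacent ranges share their endpoints.

```python
from fractions import Fraction


def aspect_ratio_tag(width: int, height: int) -> str:
    """Map an image size to a Dart v2 aspect-ratio token (boundary handling is assumed)."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    ratio = Fraction(width, height)  # exact width:height ratio, no float rounding
    if ratio >= 2:
        name = "ultra_wide"
    elif ratio > Fraction(9, 8):
        name = "wide"
    elif ratio >= Fraction(8, 9):
        name = "square"  # assumption: 9:8 and 8:9 themselves count as square
    elif ratio > Fraction(1, 2):
        name = "tall"
    else:
        name = "ultra_tall"
    return f"<|aspect_ratio:{name}|>"


if __name__ == "__main__":
    print(aspect_ratio_tag(832, 1216))  # <|aspect_ratio:tall|>
```

`Fraction` keeps the comparison exact, so a size such as 1152x1024 (exactly 9:8) lands in one bucket deterministically.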

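Model Details

Model description

Attribute	Details
Developed by	Plat
Model type	Causal language model
Language(s) (NLP)	Danbooru tags
License	Apache-2.0
Finetuned from model	dart-v2-moe-base
Demo	Available on 🤗 Space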