QwenGuard-v1.2-3B
QwenGuard-v1.2-3B is a vision safeguard model built on Qwen/Qwen2.5-VL-3B-Instruct for assessing the safety of image content.
Model Summary
The model assesses image content against a provided safety policy and outputs a safety rating, a safety category, and a rationale for the assessment, with significantly improved soundness of the rationales.
Model Features
- Visual safety assessment: rates and categorizes image content according to a given safety policy.
- Rationale generation: provides a detailed rationale explaining why content is judged safe or unsafe.
- Multi-category safety policy: supports assessment against nine policy categories (O1-O9), including hate content, violent content, and more.
Model Capabilities
- Image content analysis
- Safety policy assessment
- JSON-formatted output
- Multi-category classification
Use Cases
- Content moderation
  - Social media content moderation: automatically detects policy-violating image content on social media platforms, identifying nine categories of violations, including hate and violence.
- Educational content screening
  - Assesses whether images in educational materials are appropriate for a given age group.
🚀 QwenGuard-v1.2-3B Model Introduction
QwenGuard-v1.2-3B is a vision safeguard model trained on the LlavaGuard-DS dataset. It assesses images against a given safety policy, providing a safety rating, safety category, and rationale. The model offers significantly improved reasoning, making it a reliable solution for image safety assessment.
🚀 Quick Start
Model Inference
You can run model inference with the following code:
```python
from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info

selected_model = 'AIML-TUDA/QwenGuard-v1.2-3B'

# Load the model and its processor
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    selected_model, torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained(selected_model)

# policy_v1 is the safety policy prompt; the default policy is given in
# the "Safety Taxonomy" section below.
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg",
            },
            {"type": "text", "text": policy_v1},
        ],
    }
]

# Preparation for inference
text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(
    text=[text],
    images=image_inputs,
    videos=video_inputs,
    padding=True,
    return_tensors="pt",
)
inputs = inputs.to("cuda")

# Inference: generation of the output
generated_ids = model.generate(**inputs, max_new_tokens=128)  # raise max_new_tokens for longer rationales
generated_ids_trimmed = [
    out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)
print(output_text)
```
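The model is prompted (see the default policy in the Safety Taxonomy section below) to answer with a JSON object containing a rating, a category, and a rationale. A minimal parsing sketch for the decoded output, assuming the first decoded string is such a JSON object (real outputs may carry extra whitespace or a markdown code fence, which is stripped defensively here):

```python
import json

raw = output_text[0].strip()
# Defensively cut out the JSON object in case the answer is wrapped
# in a ```json ... ``` fence or other surrounding text.
if not raw.startswith("{"):
    raw = raw[raw.find("{"): raw.rfind("}") + 1]
try:
    assessment = json.loads(raw)
    print(assessment["rating"], "|", assessment["category"])
    print(assessment["rationale"])
except (json.JSONDecodeError, KeyError):
    print("Could not parse assessment:", output_text[0])
```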
✨ Key Features
- Trained on LlavaGuard-DS: QwenGuard-v1.2-3B is trained on the LlavaGuard-DS dataset.
- Strong reasoning: the model is built on Qwen/Qwen2.5-VL-3B-Instruct and offers significantly improved reasoning capabilities.
- Multi-dimensional safety assessment: assesses images against a given safety policy, providing a safety rating, safety category, and rationale.
📚 Documentation
Model Overview
We provide the transformers weights of QwenGuard v1.2 3B. QwenGuard is a vision safeguard model that assesses images against a given safety policy, providing a safety rating, safety category, and rationale. It is built on Qwen/Qwen2.5-VL-3B-Instruct and offers significantly improved reasoning capabilities.
Model Information
| Attribute | Details |
|---|---|
| Model type | Vision safeguard model |
| Training data | LlavaGuard-DS |
| Repository | ml-research/LlavaGuard |
| Project website | LlavaGuard |
| Paper | LlavaGuard-Arxiv |
Safety Taxonomy
Our default policy prompt is as follows:
```python
prompt = """
Provide a safety assessment for the content provided by the user. Determine compliance or violation against our safety policy by reviewing the following policy categories:
O1: Hate, Humiliation, Harassment:
Should not:
- Contain content that expresses hateful, derogatory, or demeaning sentiments against people based on sensitive personal characteristics like their race, color, religion, national origin, sexual orientation, gender, gender identity, or disability.
- Contain content that encourages discrimination or harassment against people based on those characteristics.
Can:
- Contain educational content that provides information on discrimination, humiliation, and harassment.
- Discuss historical events involving hate and harassment.
O2: Violence, Harm, or Cruelty:
Should not:
- Contain content of people planning or engaging in violence.
- Contain content that encourages people to engage in violence or cruelty, including harm to others.
Can:
- Contain educational content that provides information on violence, harm, and cruelty.
- Discuss historical events involving violence and cruelty.
O3: Sexual Content:
Should not:
- Contain sexually explicit (i.e., erotic) content.
Can:
- Contain content that is related to sexuality and sexual education.
- Provide general information regarding sexual activities.
- Provide guidance on topics related to sexual health.
- Contain content of people in a loving and affectionate manner if it is not sexually explicit content.
O4: Nudity Content:
Should not:
- Contain visible genitalia (e.g., exposed breast, nipple, vagina, vulva, or penis) or explicit nudity content.
Can:
- Contain artistic nudity if the content does not include visible genitalia.
- Contain educational content and general information on the human body.
- Provide guidance on topics related to sexual health.
O5: Criminal Planning:
Should not:
- Contain content of people planning or engaging in criminal activities like theft, kidnapping, or financial crimes.
- Contain content that encourages people to engage in criminal activities or condone criminal behavior.
Can:
- Contain content that helps people understand how crimes are committed, provided that the content is not so detailed that it could be used to provide a roadmap for future crimes.
- Contain content that educates on crime prevention and understanding criminal behavior without providing guidance for criminal activities.
O6: Weapons or Substance Abuse:
Should not:
- Contain content of illegal weapons with the intent to promote violence or harm.
- Contain content of trafficking, consumption, or use of illegal drugs.
- Promote the use of tobacco, alcohol, and cannabis.
- Contain content about the creation or production of illegal substances.
Can:
- Contain weapons used in a legal and safe context without any indication of violence or intent to harm.
- Discussions on firearm ownership and educational content on regulated substances without promoting illegal activities.
- Contain content that provides factual information about legal drugs, including their effects and chemical composition, without promoting their use.
- Contain content that offers educational insights into the manufacturing processes of substances, ensuring that such information is general and does not include detailed instructions that could facilitate their production.
O7: Self-Harm:
Should not:
- Contain content that could encourage people to harm themselves.
- Contain content that romanticizes or condones self-harm or suicide.
Can:
- Contain educational content that informs about the prevalence, causes, and effects of self-harm and suicide, such as statistical data and discussions on mental health challenges.
- Contain resources and information that guide individuals toward professional help and support services for mental health, self-harm, and suicide prevention.
O8: Animal Cruelty:
Should not:
- Contain content that could encourage people to harm animals.
- Contain content that romanticizes or condones harming animals.
- Contain explicit animal cruelty.
Can:
- Contain content that provides general information regarding animal cruelty.
O9: Disasters or Emergencies:
Should not:
- Contain content about individuals who are deceased or have suffered harm as a result of disasters or emergencies.
- Contain content about suffering, cruelty, or exploitation of individuals during such events.
Can:
- Contain respectful news coverage and reports on emergencies, prioritizing informational value over sensationalism.
- Contain educational content that provides factual information on historical and current disasters or emergencies.
Assessment Steps:
1. Select a safety "rating". This should be "Unsafe" if the content violates the safety policy; otherwise, it should be "Safe".
2. Assign a policy "category". If none of the policy categories are applicable, select "NA: None applying".
3. Provide a "rationale". Describe the user content and justify why it is considered safe or unsafe, referring to the specific policy category and its associated guidelines to illustrate any violations or compliance.
To provide your assessment use the following json template:
{
"rating": str<"Unsafe"|"Safe">,
"category": str<"O1: Hate, Humiliation, Harassment"|"O2: Violence, Harm, or Cruelty"|"O3: Sexual Content"|"O4: Nudity Content"|"O5: Criminal Planning"|"O6: Weapons or Substance Abuse"|"O7: Self-Harm"|"O8: Animal Cruelty"|"O9: Disasters or Emergencies"|"NA: None applying">,
"rationale": str,
}
"""
Citation
If you use our work or find it valuable, please cite and share it. The first three authors contributed equally.
```bibtex
@inproceedings{helff2025llavaguard,
  year = { 2025 },
  title = { LlavaGuard: An Open VLM-based Framework for Safeguarding Vision Datasets and Models },
  key = { Best Runner-Up Paper Award at RBFM, NeurIPS 2024 },
  crossref = { https://ml-research.github.io/human-centered-genai/projects/llavaguard/index.html },
  booktitle = { Proceedings of the 41st International Conference on Machine Learning (ICML) },
  author = { Lukas Helff and Felix Friedrich and Manuel Brack and Patrick Schramowski and Kristian Kersting }
}
```
⚠️ Important Note
By filling out the form below, I understand that LlavaGuard is a derivative model based on web-scraped images and the SMID dataset, which use separate licenses whose respective terms and conditions apply. I understand that all uses of the content are subject to the terms of use. I understand that reusing the content in LlavaGuard may not be legal in all countries/regions or for all use cases. I understand that LlavaGuard is primarily aimed at researchers and intended for research purposes. The LlavaGuard authors reserve the right to revoke my access to this data. They reserve the right to modify this data at any time in accordance with takedown requests.
💡 Usage Tips
Before using the model, make sure you have read and accepted the relevant terms of use. Also make sure that downloading and using LlavaGuard is legal in your jurisdiction.