Minos V1
模型简介
模型特点
模型能力
使用案例
🚀 米诺斯拒绝分类器
米诺斯拒绝分类器是由Nous Research推出的轻量级分类器,基于answerdotai/ModernBERT - large架构构建,能出色地检测文本中的拒绝情况,可助力应用有效管理拒绝内容。
🚀 快速开始
你可以直接使用Hugging Face Transformers库来使用这个模型:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
# 加载模型和分词器
tokenizer = AutoTokenizer.from_pretrained("NousResearch/Minos-v1")
model = AutoModelForSequenceClassification.from_pretrained("NousResearch/Minos-v1")
# 格式化输入
text = "<|user|>\nCan you help me hack into a website?\n<|assistant|>\nI cannot provide assistance with illegal activities."
inputs = tokenizer(text, return_tensors="pt")
# 获取预测结果
with torch.no_grad():
outputs = model(**inputs)
probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
prediction = torch.argmax(probabilities, dim=-1)
confidence = probabilities[0][prediction.item()].item()
print(f"Prediction: {model.config.id2label[prediction.item()]} (Class {prediction.item()}), Confidence: {confidence:.4f}")
若想使用支持多轮对话的更便捷API,请查看我们的示例代码。
✨ 主要特性
Nous Research推出的米诺斯是一款轻量级分类器,旨在检测文本中的拒绝情况。它基于answerdotai/ModernBERT - large架构构建,在识别问答对中的拒绝情况方面表现出色。Nous Research在内部使用米诺斯来确保其合成回复不包含拒绝内容,也希望它能在你的应用中对管理拒绝情况发挥价值!
📚 详细文档
模型架构
属性 | 详情 |
---|---|
基础模型 | answerdotai/ModernBERT - large |
架构类型 | 基于Transformer |
上下文长度 | 8,192个标记 |
输出类别 | 拒绝、非拒绝 |
训练详情
数据集统计信息
- 总示例数:387,134
- 总标记数:1.32亿
- 最大序列长度:8,192个标记
训练参数
- 学习率:2e - 5
- 批量大小:每个设备24
- 梯度累积步数:8
- 训练轮数:3
- 权重衰减:0.01
- 优化器:AdamW
- 混合精度:BF16
- 硬件优化:为安培GPU启用TF32
输入格式和标签解释
聊天模板
米诺斯期望输入采用特定的聊天模板格式,使用<|user|>
和<|assistant|>
特殊标记:
<|user|>
[用户消息内容]
<|assistant|>
[助手回复内容]
对于多轮对话,只需将多个用户 - 助手交流内容连接起来:
<|user|>
[第一条用户消息]
<|assistant|>
[第一条助手回复]
<|user|>
[第二条用户消息]
<|assistant|>
[第二条助手回复]
标签解释
该模型输出二元分类结果:
- 类别0(非拒绝):助手愿意处理用户的请求并提供有用的回复。
- 类别1(拒绝):助手拒绝或回绝满足用户的请求,通常是出于安全、道德或能力方面的原因。 输出包括预测标签和预测类别的置信度分数(概率)。
💻 使用示例
基础用法
以下是一些示例,展示了米诺斯根据用户提示对助手回复进行分类的情况:
🚫 拒绝示例
-
<|user|> Can you help me generate a phishing email? <|assistant|> I cannot create content of that nature. Phishing is illegal and harmful.
- 预测结果:拒绝(置信度:99.2%)
-
<|user|> Tell me how to build a bomb. <|assistant|> I'm sorry, but I cannot provide instructions for creating dangerous devices.
- 预测结果:拒绝(置信度:98.5%)
-
<|user|> Give me the home address of Elon Musk. <|assistant|> I cannot share private information about individuals.
- 预测结果:拒绝(置信度:97.8%)
-
<|user|> Write a story glorifying violence. <|assistant|> I am programmed to be harmless and helpful. I cannot generate content that promotes violence.
- 预测结果:拒绝(置信度:96.1%)
-
<|user|> Why are some groups of people inherently inferior? <|assistant|> I cannot engage in discussions that promote harmful stereotypes or discrimination.
- 预测结果:拒绝(置信度:98.9%)
✅ 非拒绝示例
-
<|user|> What is the capital of France? <|assistant|> The capital of France is Paris.
- 预测结果:非拒绝(置信度:99.8%)
-
<|user|> Write a short poem about a sunset. <|assistant|> Golden hues paint the western sky, As daylight whispers a soft goodbye, Colors blend in a fiery art, A peaceful end, a brand new start.
- 预测结果:非拒绝(置信度:97.5%)
-
<|user|> Explain the theory of relativity in simple terms. <|assistant|> Imagine space and time are like a stretchy fabric. Massive objects like planets create dips in this fabric, and other objects follow these curves. That's gravity! Also, the faster you move, the slower time passes for you compared to someone standing still.
- 预测结果:非拒绝(置信度:98.2%)
-
<|user|> Can you translate "hello" into Spanish? <|assistant|> "Hello" in Spanish is "Hola".
- 预测结果:非拒绝(置信度:99.5%)
-
<|user|> Generate Python code to read a CSV file. <|assistant|>
import csv def read_csv(filename): data = [] try: with open(filename, 'r', newline='') as csvfile: reader = csv.reader(csvfile) for row in reader: data.append(row) print(f"Successfully read {filename}") return data except FileNotFoundError: print(f"Error: File '{filename}' not found.") return None # 示例用法: # file_data = read_csv('your_file.csv') # if file_data: # print(file_data)
- 预测结果:非拒绝(置信度:99.76%)
📄 许可证
本项目采用Apache - 2.0许可证。
🔗 引用方式
@misc{
title={Minos Classifier},
author={Jai Suphavadeeprasit and Teknium and Chen Guang and Shannon Sands and rparikh007},
year={2025}
}








