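AI Detector

Developed by SuperAnnotate

A generated-text detection model, fine-tuned from RoBERTa Large, that identifies AI-generated content.

Text Classification

Transformers

Language: English · Open-source license: Other · #GeneratedTextDetection #MultiModelCoverage #AcademicFraudPrevention

Downloads: 2,160

Release date: 9/25/2024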

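Model Overview

This model is built specifically to detect generated (synthetic) text, which matters for curating training data and for identifying fraud and cheating in science and education.

Model Features

Balanced training data

Trained on 44k balanced text-label pairs covering human-written text and content generated by 14 LLMs.

Multi-domain coverage

The training data spans three main domains: Wikipedia, Reddit Q&A, and scientific papers.

Overfitting prevention

A chi-squared test identifies the n-grams most correlated with the label, and these are removed from the training data so the model learns genuine features rather than surface patterns.

Good calibration

Loss-function tuning and label smoothing keep prediction confidence in line with actual accuracy.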

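Model Capabilities

Detecting AI-generated text

Identifying large language model content

Distinguishing human-written from machine-generated text

Use Cases

Education

Academic integrity checks

Identify AI-generated content in student assignments

Reaches 98.5% detection accuracy on GPT-4 generated text

Data curation

Training data cleaning

Filter synthetic text out of datasets

Reaches 98% detection accuracy on LLaMA-Chat generated content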

🚀 SuperAnnotate

AI Detector
Fine-Tuned RoBERTa Large

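📚 Documentation

This model is designed to detect generated (synthetic) text. Such functionality is currently critical for determining the author of a text, for vetting training data, and for detecting fraud and cheating in scientific and educational settings. Articles on this problem: Problems with Synthetic Data | Risk of LLMs in Education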

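🔧 Technical Details

Model description

Property	Details
Model type	A custom architecture for binary sequence classification based on pre-trained RoBERTa, with a single output label.
Language	Primarily English.
License	SAIPL
Fine-tuned from	RoBERTa Large

Model sources

Repository: GitHub (for the HTTP service)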

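Training data

The training dataset for this version contains 44k text-label pairs, split evenly between two parts.

Custom generation: The first half was generated with custom, specially designed prompts, with the human versions drawn from three domains:
- Wikipedia
- Reddit ELI5 QA
- Scientific Papers (extended to include the full text of sections)
Texts were generated by 14 different models from four major LLM families (GPT, LLaMA, Anthropic, and Mistral). Each sample consists of a single prompt paired with one human-written response and one generated response; the prompts themselves were excluded from the training input.
RAID train data stratified subset: The second half is a carefully selected stratified subset of the RAID training dataset, ensuring equal representation across domains, model types, and attack methods. Each example pairs a human-authored text with a corresponding machine-generated response (produced by a single model with specific parameters and attacks applied).

This balanced structure keeps the proportion of human and generated samples approximately equal, and each prompt corresponds to one authentic answer and one generated answer.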
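The pairing scheme described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the `Pair` class, its field names, and the label encoding (0 = human, 1 = generated) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Pair:
    """One prompt with its human-written and generated answers (hypothetical layout)."""
    prompt: str          # used to obtain both texts, but never fed to the model
    human_text: str
    generated_text: str


def to_training_samples(pairs):
    """Flatten prompt-level pairs into (text, label) samples, dropping the prompt."""
    samples = []
    for pair in pairs:
        samples.append((pair.human_text, 0))      # assumed encoding: 0 = human
        samples.append((pair.generated_text, 1))  # assumed encoding: 1 = generated
    return samples


pairs = [
    Pair(
        prompt="Explain why the sky is blue.",
        human_text="Air scatters blue light more than red light.",
        generated_text="The sky appears blue because of Rayleigh scattering.",
    ),
]
samples = to_training_samples(pairs)
print(samples)
```

Because every pair contributes exactly one sample per class, the resulting set is balanced by construction.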

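⚠️ Important note

In addition, a chi-squared test was used to identify the n-grams (n ranging from 2 to 5) most strongly correlated with the target label, and these n-grams were then removed from the training data.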
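A rough sketch of how such n-grams could be scored is shown below. The source does not say how the test was implemented, so everything here is an assumption: whitespace tokenization, a 2x2 contingency table over n-gram presence per document, and the toy texts and labels.

```python
from collections import Counter


def ngrams(tokens, n_min=2, n_max=5):
    """Yield every n-gram (as a tuple) with n_min <= n <= n_max."""
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def chi2_scores(texts, labels):
    """Chi-squared statistic of each n-gram's presence against a binary label."""
    n_docs = len(texts)
    n_pos = sum(labels)
    present_pos = Counter()  # label-1 documents containing the n-gram
    present_all = Counter()  # all documents containing the n-gram
    for text, label in zip(texts, labels):
        for gram in set(ngrams(text.lower().split())):
            present_all[gram] += 1
            present_pos[gram] += label
    scores = {}
    for gram, total in present_all.items():
        a = present_pos[gram]      # present, label 1
        b = total - a              # present, label 0
        c = n_pos - a              # absent, label 1
        d = n_docs - n_pos - b     # absent, label 0
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[gram] = n_docs * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


# Toy data, invented for illustration (1 = generated, 0 = human)
texts = [
    "as an ai language model i cannot help",
    "as an ai language model i think so",
    "honestly i think so too",
    "i cannot help with that honestly",
]
labels = [1, 1, 0, 0]

scores = chi2_scores(texts, labels)
top_score = max(scores.values())
flagged = sorted(gram for gram, s in scores.items() if s == top_score)
print(flagged[:3])
```

N-grams that appear only in one class receive the highest score and would be candidates for removal, while n-grams shared evenly by both classes score zero.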

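Notable properties

One of the priorities during training was not only to maximize prediction quality but also to avoid overfitting and obtain a predictor with appropriately calibrated confidence. We are pleased to have achieved a well-calibrated model together with high prediction accuracy.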
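Calibration of this kind is commonly summarized with the expected calibration error (ECE). The source does not state which metric the authors used, so the following is only a generic sketch in plain Python; the bin count and the 0.5 decision threshold are assumptions.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """ECE for a binary classifier: weighted gap between confidence and accuracy per bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        pred = 1 if p >= 0.5 else 0
        conf = p if pred == 1 else 1.0 - p          # confidence in the predicted class
        idx = min(int(conf * n_bins), n_bins - 1)   # clamp conf == 1.0 into the last bin
        bins[idx].append((conf, pred == y))
    total = len(probs)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(ok for _, ok in bucket) / len(bucket)
        ece += len(bucket) / total * abs(avg_conf - accuracy)
    return ece


# Confidence 0.9 with 9 of 10 correct: confidence matches accuracy
well_calibrated = expected_calibration_error([0.9] * 10, [1] * 9 + [0])
# Confidence 0.9 with only 5 of 10 correct: the predictor is overconfident
overconfident = expected_calibration_error([0.9] * 10, [1] * 5 + [0] * 5)
print(well_calibrated, overconfident)
```

A value near zero means that predictions made with, say, 90% confidence are right about 90% of the time.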

📦 Installation

Prerequisite: install generated_text_detector by running the following command:

pip install git+https://github.com/superannotateai/generated_text_detector.git@v1.1.0

💻 Usage Examples

Basic usage

from generated_text_detector.utils.model.roberta_classifier import RobertaClassifier
from generated_text_detector.utils.preprocessing import preprocessing_text
from transformers import AutoTokenizer
import torch


# Load the fine-tuned classifier and its tokenizer
model = RobertaClassifier.from_pretrained("SuperAnnotate/ai-detector")
tokenizer = AutoTokenizer.from_pretrained("SuperAnnotate/ai-detector")

# Switch to inference mode (disables dropout)
model.eval()

text_example = "It's not uncommon for people to develop allergies or intolerances to certain foods as they get older. It's possible that you have always had a sensitivity to lactose (the sugar found in milk and other dairy products), but it only recently became a problem for you. This can happen because our bodies can change over time and become more or less able to tolerate certain things. It's also possible that you have developed an allergy or intolerance to something else that is causing your symptoms, such as a food additive or preservative. In any case, it's important to talk to a doctor if you are experiencing new allergy or intolerance symptoms, so they can help determine the cause and recommend treatment."

# Preprocess the text with the package's own helper
text_example = preprocessing_text(text_example)

tokens = tokenizer.encode_plus(
    text_example,
    add_special_tokens=True,
    max_length=512,
    padding='longest',
    truncation=True,
    return_token_type_ids=True,
    return_tensors="pt"
)

# Gradients are not needed for inference; the first return value is unused here
with torch.no_grad():
    _, logits = model(**tokens)

# The model emits a single logit; sigmoid maps it to a score in [0, 1]
proba = torch.sigmoid(logits).squeeze(1).item()

print(proba)
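The final step of the example turns the model's single logit into a score between 0 and 1. The sketch below reproduces that mapping in plain Python and adds a decision step. Two things are assumptions not stated in the source: that a higher score means "more likely generated", and the 0.5 threshold, which is purely illustrative.

```python
import math


def logit_to_probability(logit):
    """Numerically stable logistic sigmoid for a single float logit."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    z = math.exp(logit)
    return z / (1.0 + z)


def to_label(probability, threshold=0.5):
    """Map a score to a label; threshold and label direction are assumptions."""
    return "generated" if probability >= threshold else "human"


for logit in (-2.0, 0.0, 2.0):
    p = logit_to_probability(logit)
    print(logit, round(p, 3), to_label(p))
```

In practice the threshold would be tuned on held-out data to trade off false positives against false negatives.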

Advanced usage

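from generated_text_detector.utils.text_detector import GeneratedTextDetector
import torch


# The original example hard-codes "cuda"; falling back to "cpu" assumes the
# detector accepts any torch device string
detector = GeneratedTextDetector(
    "SuperAnnotate/ai-detector",
    device="cuda" if torch.cuda.is_available() else "cpu",
    preprocessing=True
)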

text_example = "It's not uncommon for people to develop allergies or intolerances to certain foods as they get older. It's possible that you have always had a sensitivity to lactose (the sugar found in milk and other dairy products), but it only recently became a problem for you. This can happen because our bodies can change over time and become more or less able to tolerate certain things. It's also possible that you have developed an allergy or intolerance to something else that is causing your symptoms, such as a food additive or preservative. In any case, it's important to talk to a doctor if you are experiencing new allergy or intolerance symptoms, so they can help determine the cause and recommend treatment."

res = detector.detect_report(text_example)

print(res)

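🔧 Training Details

The custom architecture was chosen for its ability to perform binary classification with a single model output, and for the customizable smoothing built into its loss function.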
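To illustrate what smoothing inside a binary loss can look like, here is a minimal sketch in plain Python. The source gives only the smoothing value (0.38), not the formula, so the target mapping used here (pulling hard labels toward 0.5) and both function names are assumptions.

```python
import math


def smooth_target(label, smoothing):
    """Pull a hard 0/1 label toward 0.5 by the smoothing factor."""
    return label * (1.0 - smoothing) + 0.5 * smoothing


def smoothed_bce(logit, label, smoothing=0.38):
    """Binary cross-entropy on one logit against a smoothed target."""
    target = smooth_target(label, smoothing)
    log_term = math.log1p(math.exp(-abs(logit)))
    neg_log_sigmoid = max(-logit, 0.0) + log_term       # -log(sigmoid(x))
    neg_log_one_minus = max(logit, 0.0) + log_term      # -log(1 - sigmoid(x))
    return target * neg_log_sigmoid + (1.0 - target) * neg_log_one_minus


print(smooth_target(1, 0.38), smooth_target(0, 0.38))
# A very confident logit is penalized more than a moderately confident one
print(smoothed_bce(10.0, 1), smoothed_bce(1.45, 1))
```

Under this formulation the targets become roughly 0.81 and 0.19, so the loss is minimized when the predicted probability is near 0.81 rather than 1.0, which discourages overconfident outputs.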

Training arguments:

Base model: FacebookAI/roberta-large
Epochs: 20
Learning rate: 5e-05
Weight decay: 0.0033
Label smoothing: 0.38
Warmup epochs: 2
Optimizer: SGD
Gradient clipping: 3.0
Scheduler: cosine with hard restarts
Scheduler cycles: 6
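The learning-rate schedule implied by these arguments can be sketched in plain Python. The formula mirrors the one used by `get_cosine_with_hard_restarts_schedule_with_warmup` in Hugging Face Transformers; whether the authors used that exact scheduler is an assumption, and `STEPS_PER_EPOCH` is a hypothetical value chosen only to make the example concrete.

```python
import math


def lr_multiplier(step, warmup_steps, total_steps, num_cycles=6):
    """Linear warmup, then cosine decay that restarts num_cycles times."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    if progress >= 1.0:
        return 0.0
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * ((num_cycles * progress) % 1.0))))


BASE_LR = 5e-05          # learning rate from the training arguments
EPOCHS = 20              # epochs from the training arguments
WARMUP_EPOCHS = 2        # warmup epochs from the training arguments
STEPS_PER_EPOCH = 100    # hypothetical; depends on dataset and batch size

total_steps = EPOCHS * STEPS_PER_EPOCH
warmup_steps = WARMUP_EPOCHS * STEPS_PER_EPOCH
schedule = [BASE_LR * lr_multiplier(s, warmup_steps, total_steps) for s in range(total_steps)]

# Learning rate at the start, at the end of warmup, and around the first restart
print(schedule[0], schedule[200], schedule[499], schedule[501])
```

With these numbers each of the six cosine cycles lasts 300 steps: the rate decays to nearly zero and then jumps back to the base value at every restart.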

🔍 Performance

The solution was validated on a stratified subset of the RAID training dataset. This benchmark includes a diverse dataset covering:

11 LLM models
11 adversarial attacks
8 domains

Detector performance

Model	Accuracy
Human	0.731
ChatGPT	0.992
GPT-2	0.649
GPT-3	0.945
GPT-4	0.985
LLaMA-Chat	0.980
Mistral	0.644
Mistral-Chat	0.975
Cohere	0.823
Cohere-Chat	0.906
MPT	0.757
MPT-Chat	0.943
Average	0.852
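For reference, the per-source figures can be aggregated directly. The snippet below computes the plain (unweighted) mean of the twelve rows; it comes out slightly higher than the reported average, which may reflect weighting by the number of samples per source. That explanation is an assumption, since the source does not say how the average was computed.

```python
# Accuracy per text source, copied from the table above
accuracy = {
    "Human": 0.731,
    "ChatGPT": 0.992,
    "GPT-2": 0.649,
    "GPT-3": 0.945,
    "GPT-4": 0.985,
    "LLaMA-Chat": 0.980,
    "Mistral": 0.644,
    "Mistral-Chat": 0.975,
    "Cohere": 0.823,
    "Cohere-Chat": 0.906,
    "MPT": 0.757,
    "MPT-Chat": 0.943,
}

unweighted_mean = sum(accuracy.values()) / len(accuracy)
weakest = sorted(accuracy, key=accuracy.get)[:3]

print(round(unweighted_mean, 3))
print(weakest)
```

The three lowest-scoring sources are the non-chat base models Mistral and GPT-2, followed by human text.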