Desklib open-source AI text detection model v1.01: distinguishes human-written English text from AI-generated English text

Ai Text Detector V1.01

Developed by desklib

An AI-generated text detection model developed by Desklib. It distinguishes human-written English text from AI-generated English text and shows leading performance on the RAID benchmark.

Text Classification

Transformers

Language: English. Open-source license: MIT. Tags: #AI text detection, #Academic integrity assurance, #Robustness to adversarial attacks

Downloads: 20.01k

Release date: 2/16/2025

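Model Overview

This model is a fine-tuned version of the microsoft/deberta-v3-large architecture. It specializes in detecting AI-generated text and is suited to areas such as content moderation and academic integrity.

Model Features

High-accuracy detection

Shows leading performance on the RAID AI detection benchmark and accurately distinguishes human-written text from AI-generated text.

High robustness

Handles adversarial attacks across a variety of domains effectively and maintains stable detection performance.

Based on the DeBERTa architecture

Uses an improved BERT architecture that achieves better performance through disentangled attention and an enhanced mask decoder.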

Model Capabilities

AI-generated text detection

Content authenticity verification

Text classification

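Use Cases

Education

Academic integrity checks

Detects whether student assignments and papers contain AI-generated content, helping to maintain academic integrity.

Helps educational institutions identify potential academic misconduct.

Content Moderation

Marking AI-generated content

Marks AI-generated content on social media and news platforms, improving content transparency.

Strengthens users' trust in the authenticity of content.

Journalism

News authenticity verification

Verifies whether news articles were written by humans, helping to prevent the spread of AI-generated misinformation.

Helps maintain the credibility and professionalism of the news industry.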

🚀 desklib/ai-text-detector-v1.01

This AI text detection model was developed by Desklib and is designed to classify English text as either human-written or AI-generated. It currently holds the top position on the RAID Benchmark for AI Detection.

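🚀 Quick Start

This AI-generated text detection model, developed by Desklib, is designed to classify English text as either human-written or AI-generated. It is a fine-tuned version of microsoft/deberta-v3-large and uses a transformer-based architecture to achieve high accuracy. It is also highly robust against adversarial attacks across a variety of domains. The model is particularly useful for applications where the authenticity of text matters, such as content moderation, academic integrity, and journalism.

Desklib provides AI-based tools for personalized learning and study assistance. This model is one of many tools that Desklib offers to students, educators, and universities.

Try the model online: Desklib AI Detector

GitHub repository: https://github.com/desklib/ai-text-detector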

✨ Key Features

At the time of submission, this model achieved the top performance on the RAID benchmark. See the RAID leaderboard.

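🔧 Technical Details

This model is built on a fine-tuned microsoft/deberta-v3-large transformer architecture. Its main components are as follows.

Transformer base: The pretrained microsoft/deberta-v3-large model serves as the foundation. It uses DeBERTa (Decoding-enhanced BERT with disentangled attention), an improvement on BERT and RoBERTa that incorporates a disentangled attention mechanism and an enhanced mask decoder for better performance.
Mean pooling: A mean pooling layer aggregates the hidden states from the transformer into a fixed-size representation of the input text. It averages the token embeddings, weighted by the attention mask, so that padding positions do not contribute to the result.
Classifier head: A linear layer acts as the classifier. It takes the pooled representation and outputs a single logit, which represents the model's confidence that the input text was generated by AI. A sigmoid activation is applied to the logit to produce a probability.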
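The mean pooling and classifier head steps described above can be illustrated on toy numbers. The following is a minimal sketch in plain Python rather than PyTorch, so that the arithmetic is easy to follow. The helper names `masked_mean_pool` and `sigmoid`, the toy hidden states, and the classifier weights and bias are all assumptions made up for illustration; the real weights are learned during fine-tuning.

```python
import math

def masked_mean_pool(hidden_states, attention_mask):
    """Average token vectors, counting only positions where the mask is 1."""
    hidden_size = len(hidden_states[0])
    sums = [0.0] * hidden_size
    for vector, keep in zip(hidden_states, attention_mask):
        for i, value in enumerate(vector):
            sums[i] += value * keep
    # Guard against an all-zero mask, similar to clamping the mask sum to 1e-9.
    count = max(sum(attention_mask), 1e-9)
    return [s / count for s in sums]

def sigmoid(logit):
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-logit))

# Toy input: 3 token positions, hidden size 2; the last position is padding.
hidden_states = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
attention_mask = [1, 1, 0]

pooled = masked_mean_pool(hidden_states, attention_mask)
print(pooled)  # [2.0, 3.0] (the padding vector is ignored)

# Hypothetical classifier weights and bias, made up for this example.
weights, bias = [0.5, -0.5], 0.5
logit = sum(w * x for w, x in zip(weights, pooled)) + bias
print(logit)           # 0.0
print(sigmoid(logit))  # 0.5
```

A logit of 0 corresponds to a probability of exactly 0.5; positive logits lean towards "AI generated" and negative logits towards "human written".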

💻 Usage Examples

Basic usage

import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoConfig, AutoModel, PreTrainedModel

class DesklibAIDetectionModel(PreTrainedModel):
    config_class = AutoConfig

    def __init__(self, config):
        super().__init__(config)
        # Initialize the base transformer model.
        self.model = AutoModel.from_config(config)
        # Define a classifier head.
        self.classifier = nn.Linear(config.hidden_size, 1)
        # Initialize weights (handled by PreTrainedModel)
        self.init_weights()

    def forward(self, input_ids, attention_mask=None, labels=None):
        # Forward pass through the transformer
        outputs = self.model(input_ids, attention_mask=attention_mask)
        last_hidden_state = outputs[0]
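        # Mean pooling (fall back to an all-ones mask if none was supplied)
        if attention_mask is None:
            attention_mask = torch.ones_like(input_ids)
        input_mask_expanded = attention_mask.unsqueeze(-1).expand(last_hidden_state.size()).float()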
        sum_embeddings = torch.sum(last_hidden_state * input_mask_expanded, dim=1)
        sum_mask = torch.clamp(input_mask_expanded.sum(dim=1), min=1e-9)
        pooled_output = sum_embeddings / sum_mask

        # Classifier
        logits = self.classifier(pooled_output)
        loss = None
        if labels is not None:
            loss_fct = nn.BCEWithLogitsLoss()
            loss = loss_fct(logits.view(-1), labels.float())

        output = {"logits": logits}
        if loss is not None:
            output["loss"] = loss
        return output

def predict_single_text(text, model, tokenizer, device, max_len=768, threshold=0.5):
    encoded = tokenizer(
        text,
        padding='max_length',
        truncation=True,
        max_length=max_len,
        return_tensors='pt'
    )
    input_ids = encoded['input_ids'].to(device)
    attention_mask = encoded['attention_mask'].to(device)

    model.eval()
    with torch.no_grad():
        outputs = model(input_ids=input_ids, attention_mask=attention_mask)
        logits = outputs["logits"]
        probability = torch.sigmoid(logits).item()

    label = 1 if probability >= threshold else 0
    return probability, label

def main():
    # --- Model and Tokenizer Directory ---
    model_directory = "desklib/ai-text-detector-v1.01"

    # --- Load tokenizer and model ---
    tokenizer = AutoTokenizer.from_pretrained(model_directory)
    model = DesklibAIDetectionModel.from_pretrained(model_directory)

    # --- Set up device ---
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)

    # --- Example Input text ---
    text_ai = "AI detection refers to the process of identifying whether a given piece of content, such as text, images, or audio, has been generated by artificial intelligence. This is achieved using various machine learning techniques, including perplexity analysis, entropy measurements, linguistic pattern recognition, and neural network classifiers trained on human and AI-generated data. Advanced AI detection tools assess writing style, coherence, and statistical properties to determine the likelihood of AI involvement. These tools are widely used in academia, journalism, and content moderation to ensure originality, prevent misinformation, and maintain ethical standards. As AI-generated content becomes increasingly sophisticated, AI detection methods continue to evolve, integrating deep learning models and ensemble techniques for improved accuracy."
    text_human = "It is estimated that a major part of the content in the internet will be generated by AI / LLMs by 2025. This leads to a lot of misinformation and credibility related issues. That is why if is important to have accurate tools to identify if a content is AI generated or human written"

    # --- Run prediction ---
    probability, predicted_label = predict_single_text(text_ai, model, tokenizer, device)
    print(f"Probability of being AI generated: {probability:.4f}")
    print(f"Predicted label: {'AI Generated' if predicted_label == 1 else 'Not AI Generated'}")

    probability, predicted_label = predict_single_text(text_human, model, tokenizer, device)
    print(f"Probability of being AI generated: {probability:.4f}")
    print(f"Predicted label: {'AI Generated' if predicted_label == 1 else 'Not AI Generated'}")

if __name__ == "__main__":
    main()
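
The `threshold` argument of `predict_single_text` decides where a probability turns into an "AI Generated" label. The sketch below shows how a stricter threshold changes the labels. It is written in plain Python and does not load the model: `label_from_probability` is a hypothetical helper that mirrors the rule in the code above, and the probabilities are made-up values, not real model outputs.

```python
def label_from_probability(probability, threshold=0.5):
    """Mirror the rule in predict_single_text: 1 = AI generated, 0 = not."""
    return 1 if probability >= threshold else 0

# Made-up probabilities for illustration only; they are not model outputs.
example_probabilities = [0.12, 0.48, 0.50, 0.93]

default_labels = [label_from_probability(p) for p in example_probabilities]
strict_labels = [label_from_probability(p, threshold=0.9) for p in example_probabilities]

print(default_labels)  # [0, 0, 1, 1]
print(strict_labels)   # [0, 0, 0, 1]
```

Raising the threshold flags fewer texts as AI generated, which may be preferable where a false accusation is costly, for example in academic integrity checks.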