# SuperAnnotate AI Detector

Fine-tuned RoBERTa Large for detecting generated/synthetic text.
## Quick Start

Before using the model, you need to install `generated_text_detector`. Run the following command:

```bash
pip install git+https://github.com/superannotateai/generated_text_detector.git@v1.1.0
```
## Features

- Text Detection: Designed to detect generated/synthetic text, which is crucial for determining text authorship, ensuring the quality of training data, and detecting fraud and cheating in scientific and educational settings.
- Custom Architecture: Based on pre-trained RoBERTa, with a custom architecture for binary sequence classification and a single output label (a hypothetical sketch of this design follows the list).
- High Performance: Achieves high accuracy in detecting text generated by various LLMs, with an average accuracy of 0.852 on the validation dataset.
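The classification head is not one of the standard `transformers` heads, which is why the usage examples below load the model through `RobertaClassifier` from `generated_text_detector` rather than `AutoModelForSequenceClassification`. The snippet below is only a hypothetical sketch of what a single-logit head on top of a pre-trained RoBERTa encoder can look like; the class name, pooling choice, and layer layout are assumptions, not the package's actual implementation.

```python
import torch.nn as nn
from transformers import RobertaModel


class SingleLogitRobertaClassifier(nn.Module):
    """Hypothetical sketch: RoBERTa encoder + one-unit head for binary detection."""

    def __init__(self, base_model: str = "FacebookAI/roberta-large"):
        super().__init__()
        self.roberta = RobertaModel.from_pretrained(base_model)
        # A single output unit: sigmoid(logit) is read as P(text is generated).
        self.head = nn.Linear(self.roberta.config.hidden_size, 1)

    def forward(self, input_ids, attention_mask=None, **kwargs):
        outputs = self.roberta(input_ids=input_ids, attention_mask=attention_mask)
        # Classify from the representation of the first (<s>) token.
        cls_repr = outputs.last_hidden_state[:, 0, :]
        return self.head(cls_repr)
```

Treat this purely as a mental model for the "single output label" design; the real `RobertaClassifier` may use a different pooling strategy, dropout, or loss handling.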
## Installation

Run the following command to install `generated_text_detector`:

```bash
pip install git+https://github.com/superannotateai/generated_text_detector.git@v1.1.0
```
## Usage Examples

### Basic Usage
```python
from generated_text_detector.utils.model.roberta_classifier import RobertaClassifier
from generated_text_detector.utils.preprocessing import preprocessing_text
from transformers import AutoTokenizer
import torch.nn.functional as F

# Load the fine-tuned detector and its tokenizer, then switch to inference mode.
model = RobertaClassifier.from_pretrained("SuperAnnotate/ai-detector")
tokenizer = AutoTokenizer.from_pretrained("SuperAnnotate/ai-detector")
model.eval()

text_example = "It's not uncommon for people to develop allergies or intolerances to certain foods as they get older. It's possible that you have always had a sensitivity to lactose (the sugar found in milk and other dairy products), but it only recently became a problem for you. This can happen because our bodies can change over time and become more or less able to tolerate certain things. It's also possible that you have developed an allergy or intolerance to something else that is causing your symptoms, such as a food additive or preservative. In any case, it's important to talk to a doctor if you are experiencing new allergy or intolerance symptoms, so they can help determine the cause and recommend treatment."

# Apply the package's text preprocessing before tokenization.
text_example = preprocessing_text(text_example)

tokens = tokenizer.encode_plus(
    text_example,
    add_special_tokens=True,
    max_length=512,
    padding='longest',
    truncation=True,
    return_token_type_ids=True,
    return_tensors="pt"
)

# The forward pass returns a pair whose second element is the logits; the
# sigmoid of the single logit is the probability that the text is generated.
_, logits = model(**tokens)
proba = F.sigmoid(logits).squeeze(1).item()
print(proba)
```
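Because there is a single output label, the score comes from a sigmoid rather than a softmax and can be read as the probability that the text is generated. The helper below shows one way to turn that score into a binary decision; the 0.5 cutoff is an assumed illustrative threshold, not a value recommended by the model card.

```python
def classify(proba: float, threshold: float = 0.5) -> str:
    """Map the sigmoid score from the example above to a label.

    The 0.5 threshold is an assumption for illustration; tune it on your own
    data if you need a particular precision/recall trade-off.
    """
    return "generated" if proba >= threshold else "human-written"


print(classify(0.97))  # -> generated
```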
### Advanced Usage

```python
from generated_text_detector.utils.text_detector import GeneratedTextDetector

# The high-level detector wraps model loading, preprocessing, and scoring.
detector = GeneratedTextDetector(
    "SuperAnnotate/ai-detector",
    device="cuda",
    preprocessing=True
)

text_example = "It's not uncommon for people to develop allergies or intolerances to certain foods as they get older. It's possible that you have always had a sensitivity to lactose (the sugar found in milk and other dairy products), but it only recently became a problem for you. This can happen because our bodies can change over time and become more or less able to tolerate certain things. It's also possible that you have developed an allergy or intolerance to something else that is causing your symptoms, such as a food additive or preservative. In any case, it's important to talk to a doctor if you are experiencing new allergy or intolerance symptoms, so they can help determine the cause and recommend treatment."

res = detector.detect_report(text_example)
print(res)
```
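`detect_report` takes a single text, so scoring several documents is just a loop over the same call. The sketch below relies only on the constructor and method shown above; the structure of the returned report is not documented here, so it is simply printed.

```python
from generated_text_detector.utils.text_detector import GeneratedTextDetector

detector = GeneratedTextDetector(
    "SuperAnnotate/ai-detector",
    device="cuda",       # use "cpu" if no GPU is available
    preprocessing=True
)

texts = [
    "A first document to score.",
    "A second document to score.",
]

# Score each document independently and keep the report next to its input.
for text in texts:
    report = detector.detect_report(text)
    print(text[:60], "->", report)
```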
## Documentation

### Model Details

| Property | Details |
|----------|---------|
| Model Type | A custom architecture for binary sequence classification based on pre-trained RoBERTa, with a single output label |
| Language(s) | Primarily English |
| License | SAIPL |
| Finetuned from model | RoBERTa Large |
| Repository | [GitHub](https://github.com/superannotateai/generated_text_detector) for HTTP service |
### Training Data

The training dataset for this version includes 44k text-label pairs, split equally between two parts:

- Custom Generation: The first half was generated using custom prompts, with human-sourced text from three domains: Wikipedia, Reddit ELI5 QA, and Scientific Papers. Texts were generated by 14 different models across four major LLM families (GPT, LLaMA, Anthropic, and Mistral).
- RAID Train Data Stratified Subset: The second half is a carefully selected stratified subset of the RAID train dataset, ensuring equal representation across domains, model types, and attack methods (see the sampling sketch after this list).
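For readers who want to build a similar subset themselves, the sketch below shows one way to draw an equally represented sample with pandas. The column names (`domain`, `model`, `attack`) and the per-group size are assumptions about the RAID metadata, not a description of the exact procedure used for this model.

```python
import pandas as pd


def stratified_subset(df: pd.DataFrame, per_group: int = 100, seed: int = 0) -> pd.DataFrame:
    """Sample up to `per_group` rows from every (domain, model, attack) combination.

    Column names and the group size are illustrative assumptions; adapt them
    to the actual metadata of the dataset you are subsetting.
    """
    return (
        df.groupby(["domain", "model", "attack"], group_keys=False)
          .apply(lambda g: g.sample(n=min(per_group, len(g)), random_state=seed))
          .reset_index(drop=True)
    )
```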
### Important Note

Key n-grams (n ranging from 2 to 5) that exhibited the highest correlation with the target labels were identified with a chi-squared test and removed from the training data.
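A minimal sketch of that kind of filtering, assuming scikit-learn's `CountVectorizer` and `chi2`, is shown below. The vocabulary size and the top-k cutoff are illustrative assumptions; the card only states that highly label-correlated 2- to 5-grams were removed.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import chi2

# Toy stand-ins for the training texts and their labels (0 = human, 1 = generated).
texts = ["an example of human written text", "an example of generated text"]
labels = [0, 1]

# Count 2- to 5-gram occurrences, matching the n range in the note above.
vectorizer = CountVectorizer(ngram_range=(2, 5), max_features=50_000)
X = vectorizer.fit_transform(texts)

# Chi-squared association of each n-gram with the target label.
scores, _ = chi2(X, labels)
ngrams = vectorizer.get_feature_names_out()

# The most label-correlated n-grams would be candidates for removal
# (the top-k cutoff here is an assumed choice).
top_k = 100
flagged = {ngrams[i] for i in scores.argsort()[::-1][:top_k]}
print(sorted(flagged)[:10])
```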
### Training Details

| Parameter | Value |
|-----------|-------|
| Base Model | FacebookAI/roberta-large |
| Epochs | 20 |
| Learning Rate | 5e-05 |
| Weight Decay | 0.0033 |
| Label Smoothing | 0.38 |
| Warmup Epochs | 2 |
| Optimizer | SGD |
| Gradient Clipping | 3.0 |
| Scheduler | Cosine with hard restarts |
| Number of Scheduler Cycles | 6 |
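These hyperparameters map onto standard PyTorch and `transformers` components fairly directly. The sketch below shows how such a setup could be wired together; the number of steps per epoch, the use of `BCEWithLogitsLoss`, and the exact label-smoothing formulation are assumptions for illustration, since the card only lists the values above.

```python
import torch
from torch.optim import SGD
from transformers import get_cosine_with_hard_restarts_schedule_with_warmup
from generated_text_detector.utils.model.roberta_classifier import RobertaClassifier

model = RobertaClassifier.from_pretrained("SuperAnnotate/ai-detector")

# Values from the table above; steps_per_epoch is an assumed placeholder.
steps_per_epoch = 1000
epochs, warmup_epochs, num_cycles = 20, 2, 6
label_smoothing = 0.38

optimizer = SGD(model.parameters(), lr=5e-05, weight_decay=0.0033)
scheduler = get_cosine_with_hard_restarts_schedule_with_warmup(
    optimizer,
    num_warmup_steps=warmup_epochs * steps_per_epoch,
    num_training_steps=epochs * steps_per_epoch,
    num_cycles=num_cycles,
)
loss_fn = torch.nn.BCEWithLogitsLoss()


def smooth(targets: torch.Tensor, eps: float = label_smoothing) -> torch.Tensor:
    # One common way of smoothing binary targets toward 0.5; the card does not
    # state the exact formulation that was used.
    return targets * (1.0 - eps) + 0.5 * eps


# Inside a training step (sketch):
#   _, logits = model(**batch_tokens)
#   loss = loss_fn(logits.squeeze(-1), smooth(batch_labels.float()))
#   loss.backward()
#   torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=3.0)
#   optimizer.step(); scheduler.step(); optimizer.zero_grad()
```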
### Performance

The detector was validated on a stratified subset of the RAID train dataset, which includes 11 LLM models, 11 adversarial attacks, and 8 domains. Accuracy per source model is reported below (a sketch of this kind of per-model evaluation follows the table).
| Model | Accuracy |
|-------|----------|
| Human | 0.731 |
| ChatGPT | 0.992 |
| GPT-2 | 0.649 |
| GPT-3 | 0.945 |
| GPT-4 | 0.985 |
| LLaMA-Chat | 0.980 |
| Mistral | 0.644 |
| Mistral-Chat | 0.975 |
| Cohere | 0.823 |
| Cohere-Chat | 0.906 |
| MPT | 0.757 |
| MPT-Chat | 0.943 |
| Average | 0.852 |
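If you run your own evaluation on a labeled set such as the RAID subset, per-generator numbers like those above can be computed by grouping predictions by the source model. The sketch below assumes a pandas DataFrame with hypothetical `model`, `label`, and `pred` columns; it is not the script that produced the table, and the card does not state exactly how the reported average was aggregated.

```python
import pandas as pd

# Hypothetical evaluation frame: one row per text, with the source model,
# the gold label (1 = generated, 0 = human) and the detector's binary prediction.
df = pd.DataFrame({
    "model": ["human", "gpt4", "gpt4", "mistral"],
    "label": [0, 1, 1, 1],
    "pred":  [0, 1, 1, 0],
})

# Accuracy per source model, plus a simple macro average for illustration.
per_model = (df["label"] == df["pred"]).groupby(df["model"]).mean()
print(per_model)
print(per_model.mean())
```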
## License

The model is licensed under SAIPL.