# SuperAnnotate AI Detector

Fine-tuned RoBERTa Large for detecting generated/synthetic text.
## Quick Start

Before using the model, you need to install `generated_text_detector`. Run the following command:

```bash
pip install git+https://github.com/superannotateai/generated_text_detector.git@v1.1.0
```
## Features

- Text Detection: Designed to detect generated/synthetic text, which is crucial for determining text authorship, ensuring the quality of training data, and detecting fraud and cheating in scientific and educational settings.
- Custom Architecture: Based on pre-trained RoBERTa, with a custom architecture for binary sequence classification and a single output label (a hypothetical sketch of this design follows the list).
- High Performance: Achieves high accuracy in detecting text generated by various LLMs, with an average accuracy of 0.852 on the validation dataset.
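The classification head is not one of the standard `transformers` heads, which is why the usage examples below load the model through `RobertaClassifier` from `generated_text_detector` rather than `AutoModelForSequenceClassification`. The snippet below is only a hypothetical sketch of what a single-logit head on top of a pre-trained RoBERTa encoder can look like; the class name, pooling choice, and layer layout are assumptions, not the package's actual implementation.

```python
import torch.nn as nn
from transformers import RobertaModel


class SingleLogitRobertaClassifier(nn.Module):
    """Hypothetical sketch: RoBERTa encoder + one-unit head for binary detection."""

    def __init__(self, base_model: str = "FacebookAI/roberta-large"):
        super().__init__()
        self.roberta = RobertaModel.from_pretrained(base_model)
        # A single output unit: sigmoid(logit) is read as P(text is generated).
        self.head = nn.Linear(self.roberta.config.hidden_size, 1)

    def forward(self, input_ids, attention_mask=None, **kwargs):
        outputs = self.roberta(input_ids=input_ids, attention_mask=attention_mask)
        # Classify from the representation of the first (<s>) token.
        cls_repr = outputs.last_hidden_state[:, 0, :]
        return self.head(cls_repr)
```

Treat this purely as a mental model for the "single output label" design; the real `RobertaClassifier` may use a different pooling strategy, dropout, or loss handling.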
## Installation

Run the following command to install `generated_text_detector`:

```bash
pip install git+https://github.com/superannotateai/generated_text_detector.git@v1.1.0
```
## Usage Examples

### Basic Usage
```python
from generated_text_detector.utils.model.roberta_classifier import RobertaClassifier
from generated_text_detector.utils.preprocessing import preprocessing_text
from transformers import AutoTokenizer
import torch.nn.functional as F

# Load the fine-tuned detector and its tokenizer, then switch to inference mode.
model = RobertaClassifier.from_pretrained("SuperAnnotate/ai-detector")
tokenizer = AutoTokenizer.from_pretrained("SuperAnnotate/ai-detector")
model.eval()

text_example = "It's not uncommon for people to develop allergies or intolerances to certain foods as they get older. It's possible that you have always had a sensitivity to lactose (the sugar found in milk and other dairy products), but it only recently became a problem for you. This can happen because our bodies can change over time and become more or less able to tolerate certain things. It's also possible that you have developed an allergy or intolerance to something else that is causing your symptoms, such as a food additive or preservative. In any case, it's important to talk to a doctor if you are experiencing new allergy or intolerance symptoms, so they can help determine the cause and recommend treatment."

# Apply the package's text preprocessing before tokenization.
text_example = preprocessing_text(text_example)

tokens = tokenizer.encode_plus(
    text_example,
    add_special_tokens=True,
    max_length=512,
    padding='longest',
    truncation=True,
    return_token_type_ids=True,
    return_tensors="pt"
)

# The forward pass returns a pair whose second element is the logits; the
# sigmoid of the single logit is the probability that the text is generated.
_, logits = model(**tokens)
proba = F.sigmoid(logits).squeeze(1).item()
print(proba)
```
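Because there is a single output label, the score comes from a sigmoid rather than a softmax and can be read as the probability that the text is generated. The helper below shows one way to turn that score into a binary decision; the 0.5 cutoff is an assumed illustrative threshold, not a value recommended by the model card.

```python
def classify(proba: float, threshold: float = 0.5) -> str:
    """Map the sigmoid score from the example above to a label.

    The 0.5 threshold is an assumption for illustration; tune it on your own
    data if you need a particular precision/recall trade-off.
    """
    return "generated" if proba >= threshold else "human-written"


print(classify(0.97))  # -> generated
```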
### Advanced Usage

```python
from generated_text_detector.utils.text_detector import GeneratedTextDetector

# The high-level detector wraps model loading, preprocessing, and scoring.
detector = GeneratedTextDetector(
    "SuperAnnotate/ai-detector",
    device="cuda",
    preprocessing=True
)

text_example = "It's not uncommon for people to develop allergies or intolerances to certain foods as they get older. It's possible that you have always had a sensitivity to lactose (the sugar found in milk and other dairy products), but it only recently became a problem for you. This can happen because our bodies can change over time and become more or less able to tolerate certain things. It's also possible that you have developed an allergy or intolerance to something else that is causing your symptoms, such as a food additive or preservative. In any case, it's important to talk to a doctor if you are experiencing new allergy or intolerance symptoms, so they can help determine the cause and recommend treatment."

res = detector.detect_report(text_example)
print(res)
```
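`detect_report` takes a single text, so scoring several documents is just a loop over the same call. The sketch below relies only on the constructor and method shown above; the structure of the returned report is not documented here, so it is simply printed.

```python
from generated_text_detector.utils.text_detector import GeneratedTextDetector

detector = GeneratedTextDetector(
    "SuperAnnotate/ai-detector",
    device="cuda",       # use "cpu" if no GPU is available
    preprocessing=True
)

texts = [
    "A first document to score.",
    "A second document to score.",
]

# Score each document independently and keep the report next to its input.
for text in texts:
    report = detector.detect_report(text)
    print(text[:60], "->", report)
```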
## Documentation

### Model Details

| Property | Details |
|----------|---------|
| Model Type | A custom architecture for binary sequence classification based on pre-trained RoBERTa, with a single output label |
| Language(s) | Primarily English |
| License | SAIPL |
| Finetuned from model | RoBERTa Large |
| Repository | [GitHub](https://github.com/superannotateai/generated_text_detector) for HTTP service |
### Training Data

The training dataset for this version includes 44k text-label pairs, split equally between two parts:

- Custom Generation: The first half was generated using custom prompts, with human-sourced text from three domains: Wikipedia, Reddit ELI5 QA, and Scientific Papers. Texts were generated by 14 different models across four major LLM families (GPT, LLaMA, Anthropic, and Mistral).
- RAID Train Data Stratified Subset: The second half is a carefully selected stratified subset of the RAID train dataset, ensuring equal representation across domains, model types, and attack methods (see the sampling sketch after this list).
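For readers who want to build a similar subset themselves, the sketch below shows one way to draw an equally represented sample with pandas. The column names (`domain`, `model`, `attack`) and the per-group size are assumptions about the RAID metadata, not a description of the exact procedure used for this model.

```python
import pandas as pd


def stratified_subset(df: pd.DataFrame, per_group: int = 100, seed: int = 0) -> pd.DataFrame:
    """Sample up to `per_group` rows from every (domain, model, attack) combination.

    Column names and the group size are illustrative assumptions; adapt them
    to the actual metadata of the dataset you are subsetting.
    """
    return (
        df.groupby(["domain", "model", "attack"], group_keys=False)
          .apply(lambda g: g.sample(n=min(per_group, len(g)), random_state=seed))
          .reset_index(drop=True)
    )
```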
### Important Note

Key n-grams (n ranging from 2 to 5) that exhibited the highest correlation with the target labels were identified with a chi-squared test and removed from the training data.
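A minimal sketch of that kind of filtering, assuming scikit-learn's `CountVectorizer` and `chi2`, is shown below. The vocabulary size and the top-k cutoff are illustrative assumptions; the card only states that highly label-correlated 2- to 5-grams were removed.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import chi2

# Toy stand-ins for the training texts and their labels (0 = human, 1 = generated).
texts = ["an example of human written text", "an example of generated text"]
labels = [0, 1]

# Count 2- to 5-gram occurrences, matching the n range in the note above.
vectorizer = CountVectorizer(ngram_range=(2, 5), max_features=50_000)
X = vectorizer.fit_transform(texts)

# Chi-squared association of each n-gram with the target label.
scores, _ = chi2(X, labels)
ngrams = vectorizer.get_feature_names_out()

# The most label-correlated n-grams would be candidates for removal
# (the top-k cutoff here is an assumed choice).
top_k = 100
flagged = {ngrams[i] for i in scores.argsort()[::-1][:top_k]}
print(sorted(flagged)[:10])
```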
### Training Details

| Parameter | Value |
|-----------|-------|
| Base Model | FacebookAI/roberta-large |
| Epochs | 20 |
| Learning Rate | 5e-05 |
| Weight Decay | 0.0033 |
| Label Smoothing | 0.38 |
| Warmup Epochs | 2 |
| Optimizer | SGD |
| Gradient Clipping | 3.0 |
| Scheduler | Cosine with hard restarts |
| Number of Scheduler Cycles | 6 |
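These hyperparameters map onto standard PyTorch and `transformers` components fairly directly. The sketch below shows how such a setup could be wired together; the number of steps per epoch, the use of `BCEWithLogitsLoss`, and the exact label-smoothing formulation are assumptions for illustration, since the card only lists the values above.

```python
import torch
from torch.optim import SGD
from transformers import get_cosine_with_hard_restarts_schedule_with_warmup
from generated_text_detector.utils.model.roberta_classifier import RobertaClassifier

model = RobertaClassifier.from_pretrained("SuperAnnotate/ai-detector")

# Values from the table above; steps_per_epoch is an assumed placeholder.
steps_per_epoch = 1000
epochs, warmup_epochs, num_cycles = 20, 2, 6
label_smoothing = 0.38

optimizer = SGD(model.parameters(), lr=5e-05, weight_decay=0.0033)
scheduler = get_cosine_with_hard_restarts_schedule_with_warmup(
    optimizer,
    num_warmup_steps=warmup_epochs * steps_per_epoch,
    num_training_steps=epochs * steps_per_epoch,
    num_cycles=num_cycles,
)
loss_fn = torch.nn.BCEWithLogitsLoss()


def smooth(targets: torch.Tensor, eps: float = label_smoothing) -> torch.Tensor:
    # One common way of smoothing binary targets toward 0.5; the card does not
    # state the exact formulation that was used.
    return targets * (1.0 - eps) + 0.5 * eps


# Inside a training step (sketch):
#   _, logits = model(**batch_tokens)
#   loss = loss_fn(logits.squeeze(-1), smooth(batch_labels.float()))
#   loss.backward()
#   torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=3.0)
#   optimizer.step(); scheduler.step(); optimizer.zero_grad()
```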
### Performance

The detector was validated on a stratified subset of the RAID train dataset, which includes 11 LLM models, 11 adversarial attacks, and 8 domains. Accuracy per source model is reported below (a sketch of this kind of per-model evaluation follows the table).
| Model | Accuracy |
|-------|----------|
| Human | 0.731 |
| ChatGPT | 0.992 |
| GPT-2 | 0.649 |
| GPT-3 | 0.945 |
| GPT-4 | 0.985 |
| LLaMA-Chat | 0.980 |
| Mistral | 0.644 |
| Mistral-Chat | 0.975 |
| Cohere | 0.823 |
| Cohere-Chat | 0.906 |
| MPT | 0.757 |
| MPT-Chat | 0.943 |
| Average | 0.852 |
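If you run your own evaluation on a labeled set such as the RAID subset, per-generator numbers like those above can be computed by grouping predictions by the source model. The sketch below assumes a pandas DataFrame with hypothetical `model`, `label`, and `pred` columns; it is not the script that produced the table, and the card does not state exactly how the reported average was aggregated.

```python
import pandas as pd

# Hypothetical evaluation frame: one row per text, with the source model,
# the gold label (1 = generated, 0 = human) and the detector's binary prediction.
df = pd.DataFrame({
    "model": ["human", "gpt4", "gpt4", "mistral"],
    "label": [0, 1, 1, 1],
    "pred":  [0, 1, 1, 0],
})

# Accuracy per source model, plus a simple macro average for illustration.
per_model = (df["label"] == df["pred"]).groupby(df["model"]).mean()
print(per_model)
print(per_model.mean())
```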
## License

The model is licensed under SAIPL.