# RoBERTArg
RoBERTArg is a model that classifies text into two labels: NON-ARGUMENT (0) and ARGUMENT (1). It was trained on a dataset of controversial topics, making it a useful tool for argument mining.
## Quick Start

The model has been trained to classify text by whether or not it presents an argument. You can use it as a starting point for research in argument mining.
## Features

- Heterogeneous training data: trained on ~25k manually annotated sentences covering controversial topics.
- Binary classification: classifies text into the NON-ARGUMENT and ARGUMENT labels.
## Installation

The model can be used via the Hugging Face `transformers` library (with PyTorch): `pip install transformers torch`.
## Usage Examples

The model can be loaded with the Hugging Face `transformers` text-classification `pipeline`.
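A minimal sketch using the `transformers` pipeline API. Note that the Hub model ID `chkla/Roberta-Argument` is an assumption (it is not stated in this card); substitute the correct ID if it differs.

```python
# Hedged sketch: classifying sentences with the transformers pipeline API.
# The Hub model ID "chkla/Roberta-Argument" is assumed, not stated in this card.
from transformers import pipeline


def classify(texts):
    """Return label/score predictions for a list of sentences."""
    classifier = pipeline("text-classification", model="chkla/Roberta-Argument")
    return classifier(texts)


if __name__ == "__main__":
    result = classify(
        ["Marijuana should be legalized because prohibition has failed to reduce its use."]
    )
    print(result)  # a list of dicts with 'label' and 'score' keys
```

Each prediction is a dict containing the predicted label and a confidence score between 0 and 1.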
## Documentation
### Model description

This model was trained on ~25k heterogeneous, manually annotated sentences ([Stab et al. 2018](https://www.aclweb.org/anthology/D18-1402/)) covering controversial topics to classify text into one of two labels: NON-ARGUMENT (0) and ARGUMENT (1).
### Dataset

In the dataset (Stab et al. 2018), a sentence is labeled as an ARGUMENT (~11k sentences) if it supports or opposes the topic and gives a relevant reason for that stance, and as a NON-ARGUMENT (~14k sentences) if it gives no reason. The authors focus on controversial topics, i.e., topics with "an obvious polarity to the possible outcomes", and compile a final set of eight: abortion, school uniforms, death penalty, marijuana legalization, nuclear energy, cloning, gun control, and minimum wage.
| Topic | ARGUMENT | NON-ARGUMENT |
| --- | --- | --- |
| abortion | 2,213 | 2,427 |
| school uniforms | 325 | 1,734 |
| death penalty | 325 | 2,083 |
| marijuana legalization | 325 | 1,262 |
| nuclear energy | 325 | 2,118 |
| cloning | 325 | 1,494 |
| gun control | 325 | 1,889 |
| minimum wage | 325 | 1,346 |
### Model training

RoBERTArg was fine-tuned from a pre-trained RoBERTa (base) model from Hugging Face using the Hugging Face `Trainer` with the following hyperparameters:
```python
from transformers import TrainingArguments

# These arguments are passed to transformers.Trainer for fine-tuning.
training_args = TrainingArguments(
    output_dir="./results",  # assumed; the output directory is not given in the original card
    num_train_epochs=2,
    learning_rate=2.3102e-06,
    seed=8,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
)
```
### Evaluation

The model was evaluated on a held-out evaluation set (20% of the data); R and P denote per-class recall and precision:
| Model | Acc | F1 | R arg | R non | P arg | P non |
| --- | --- | --- | --- | --- | --- | --- |
| RoBERTArg | 0.8193 | 0.8021 | 0.8463 | 0.7986 | 0.7623 | 0.8719 |
The confusion matrix on the same evaluation set (rows = actual class, columns = predicted class; the label order shown is the one consistent with the per-class recall and precision reported above):

| | NON-ARGUMENT | ARGUMENT |
| --- | --- | --- |
| **NON-ARGUMENT** | 2,213 | 558 |
| **ARGUMENT** | 325 | 1,790 |
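The reported scores can be re-derived from the confusion matrix. Note that the per-class recall and precision values only line up if the first row and column are read as NON-ARGUMENT and the second as ARGUMENT, with rows as actual and columns as predicted classes. A quick check:

```python
# Re-deriving the evaluation scores from the confusion matrix.
# Reading: rows = actual class, columns = predicted class,
# in the order [NON-ARGUMENT, ARGUMENT] -- the only reading
# consistent with the reported per-class recall and precision.
cm = [
    [2213, 558],   # actual NON-ARGUMENT: predicted NON / predicted ARG
    [325, 1790],   # actual ARGUMENT:     predicted NON / predicted ARG
]
tn, fp = cm[0]
fn, tp = cm[1]
total = tn + fp + fn + tp

accuracy = (tp + tn) / total            # ~0.8193
recall_arg = tp / (tp + fn)             # ~0.8463
recall_non = tn / (tn + fp)             # ~0.7986
precision_arg = tp / (tp + fp)          # ~0.7623
precision_non = tn / (tn + fn)          # ~0.8719
f1_arg = 2 * tp / (2 * tp + fp + fn)    # ~0.8021 (F1 for the ARGUMENT class)

print(accuracy, f1_arg, recall_arg, recall_non, precision_arg, precision_non)
```

This reproduces the Acc, F1, R, and P values in the evaluation table above, which suggests the reported F1 is the F1 score of the ARGUMENT class.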
## Important Note

The model can only be a starting point for diving into the exciting field of argument mining. Be aware that an argument is a complex structure with multiple dependencies, so the model may perform worse on topics and text types not included in the training set.
## Twitter

Follow the developer on Twitter: @chklamm