Policlim Model
A model that detects climate change salience in (political) text, with a validation F1 score of .935 and accuracy of .957.
Quick Start
You can use the model for text classification, or use it as a base model to fine-tune for additional tasks. The simpletransformers package makes both straightforward. To classify new text:
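import pandas as pd
from simpletransformers.classification import ClassificationModel

# Load the texts to classify; the CSV is expected to have a 'text' column.
data = pd.read_csv('your_data.csv')

# Load the fine-tuned model (pass use_cuda=False if no GPU is available).
model = ClassificationModel(
    model_type="xlmroberta", model_name="policlim"
)

# preds holds the predicted labels, output the raw model outputs.
preds, output = model.predict(data['text'].tolist())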
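The raw outputs returned by `model.predict` are unnormalized scores rather than probabilities. As a minimal sketch, they can be turned into per-sentence probabilities with a softmax. The helper names below are hypothetical, the example values are made up, and treating index 1 as the climate-related class is an assumption that should be checked against the label coding in the paper.

```python
import math

def softmax(logits):
    """Convert one row of raw model outputs (logits) into probabilities."""
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def climate_probabilities(raw_outputs, positive_index=1):
    """Probability of the positive class for each row of raw outputs.

    positive_index=1 is an assumption about the label coding.
    """
    return [softmax(list(row))[positive_index] for row in raw_outputs]

# Made-up raw outputs for three sentences, for illustration only.
example_outputs = [[2.0, -1.0], [-0.5, 1.5], [0.0, 0.0]]
probs = climate_probabilities(example_outputs)
print(probs)
```

With real data, `output` from the snippet above would take the place of `example_outputs`.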
# Fine-tune policlim on new labeled data.
from functools import partial

import pandas as pd
from simpletransformers.classification import ClassificationModel
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Each CSV needs a 'text' column and a 'labels' column.
new_train = pd.read_csv('your_new_train_data.csv')
new_test = pd.read_csv('your_new_test_data.csv')
new_eval = pd.read_csv('your_new_eval_data.csv')

model = ClassificationModel(
    model_type="xlmroberta",
    model_name="policlim",
    num_labels=2,
    ignore_mismatched_sizes=True,
    use_cuda=True
)

# Extra metrics are passed as functions that are called as metric(labels, preds).
model.train_model(
    train_df=new_train, eval_df=new_test,
    f1_train=partial(f1_score, average=None)
)

result, model_outputs, wrong_predictions = model.eval_model(
    new_eval,
    f1_eval=partial(f1_score, average=None),
    precision=partial(precision_score, average=None),
    recall=partial(recall_score, average=None),
    acc=accuracy_score
)
print('\n\nThese are the results when testing the model on the held-out data set:\n')
print(result)
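With `average=None`, scikit-learn reports precision, recall, and F1 separately for each class instead of a single averaged number. The following is a small dependency-free sketch of that computation, meant only to show what the per-class values mean. The function name and the toy labels are hypothetical, and a zero denominator is defined as 0.0 here.

```python
def per_class_scores(labels, preds, classes=(0, 1)):
    """Compute precision, recall and F1 for each class separately."""
    scores = {}
    for c in classes:
        pairs = list(zip(labels, preds))
        tp = sum(1 for y, p in pairs if y == c and p == c)  # true positives
        fp = sum(1 for y, p in pairs if y != c and p == c)  # false positives
        fn = sum(1 for y, p in pairs if y == c and p != c)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        scores[c] = {"precision": precision, "recall": recall, "f1": f1}
    return scores

# Toy gold labels and predictions, for illustration only.
toy_labels = [1, 0, 0, 0]
toy_preds = [1, 1, 0, 0]
scores = per_class_scores(toy_labels, toy_preds)
print(scores)
```

Per-class values are useful when the classes are imbalanced, since a single averaged score can hide weak performance on the rarer class.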
Features
- Detects climate change salience in (political) text.
- Fine-tunes base XLM-RoBERTa using manually annotated quasi-sentences from political manifestos.
- Achieves a validation F1 score of .935 and accuracy of .957.
Documentation
Model Description
This model detects climate change salience in (political) text. It fine-tunes base XLM-RoBERTa on 3,434 manually annotated quasi-sentences from political manifestos (retrieved from the Manifesto Project Database). The model achieves a validation F1 score of .935 and accuracy of .957.
We have used the model to classify the climate change salience of political manifestos, the first step of which is detailed in the working paper below. The paper contains all relevant details of the training set, procedure, and evaluation of the model and final dataset.
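One plausible way to move from sentence-level predictions to a manifesto-level salience measure is to take the share of quasi-sentences predicted as climate-related in each manifesto. The sketch below illustrates that idea only; it is not necessarily the procedure used in the paper. The function name, the document identifiers, and the choice of 1 as the positive label are all assumptions.

```python
from collections import defaultdict

def salience_by_document(doc_ids, preds, positive_label=1):
    """Share of sentences predicted as climate-related, per document.

    positive_label=1 is an assumption about the label coding.
    """
    counts = defaultdict(lambda: [0, 0])  # doc_id -> [positive, total]
    for doc_id, pred in zip(doc_ids, preds):
        counts[doc_id][1] += 1
        if pred == positive_label:
            counts[doc_id][0] += 1
    return {doc_id: pos / total for doc_id, (pos, total) in counts.items()}

# Hypothetical manifesto identifiers and sentence-level predictions.
doc_ids = ["party_a", "party_a", "party_a", "party_a", "party_b", "party_b"]
sentence_preds = [1, 0, 0, 1, 0, 0]
shares = salience_by_document(doc_ids, sentence_preds)
print(shares)
```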
Model Sources
- Repository: https://github.com/marysanford/policlim/tree/main
- Paper: https://osf.io/preprints/osf/bq356
- Data source: https://manifesto-project.wzb.eu/
Citation Information
@techreport{sanford2024policlim,
title={Policlim: A Dataset of Climate Change Discourse in the Political Manifestos of 45 Countries from 1990-2022},
author={Sanford, Mary and Pianta, Silvia and Schmid, Nicolas and Musto, Giorgio},
type={Working paper},
doi={https://osf.io/preprints/osf/bq356_v4},
year={2025}
}
License
No license information is provided.
Model Card Authors
Mary Sanford, mary.sanford@cmcc.it