# 🚀 Advanced Suicidality Classifier Model

This project offers a machine learning solution for detecting text sequences indicative of suicidality.
## 🚀 Quick Start

To get started with the Advanced Suicidality Classifier Model, install the required library, then use the code snippets below for text classification.
## 📦 Installation

To use the model, install the Transformers library:

```bash
pip install transformers
```
## 💻 Usage Examples

### Basic Usage

You can use the model for text classification with the pipeline approach:

```python
from transformers import pipeline

classifier = pipeline("sentiment-analysis", model="sentinetyd/suicidality")
result = classifier("text to classify")
print(result)
```
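The pipeline returns a list with one dictionary per input, each containing a `label` and a `score`. The sketch below shows one way to post-process that output; the hard-coded sample result and the 0.5 confidence threshold are illustrative assumptions, not part of the model card:

```python
# Hard-coded sample in the pipeline's output format, for illustration only.
sample_result = [{"label": "LABEL_1", "score": 0.87}]

LABEL_MEANINGS = {
    "LABEL_0": "non-suicidal",
    "LABEL_1": "indicative of suicidality",
}

def interpret(result, threshold=0.5):
    """Map a pipeline result to a human-readable label and a confidence flag."""
    item = result[0]
    meaning = LABEL_MEANINGS[item["label"]]
    confident = item["score"] >= threshold
    return meaning, confident

meaning, confident = interpret(sample_result)
print(meaning, confident)
```

In practice, predictions near the threshold deserve the most careful human review.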
### Advanced Usage

Using the tokenizer and model programmatically (note that `AutoModelForSequenceClassification` is needed here rather than `AutoModel`, so that the classification head is loaded along with the base encoder):

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("sentinetyd/suicidality")
model = AutoModelForSequenceClassification.from_pretrained("sentinetyd/suicidality")
```
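When calling the model directly rather than through a pipeline, the forward pass returns raw logits. A minimal, framework-free sketch of turning a pair of logits into a label and confidence via softmax (the logit values here are illustrative, not real model output):

```python
import math

def logits_to_prediction(logits):
    """Convert a pair of raw logits into (label, confidence) via softmax."""
    # Subtract the max logit for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    labels = ["LABEL_0", "LABEL_1"]  # LABEL_0: non-suicidal, LABEL_1: suicidality
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

label, score = logits_to_prediction([-1.2, 2.3])  # illustrative logits
print(label, round(score, 4))
```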
## ✨ Features

- The model classifies input text into two labels: `LABEL_0` (non-suicidal) and `LABEL_1` (indicative of suicidality).
- It is fine-tuned using the ELECTRA architecture on a carefully curated dataset, achieving high performance in detecting suicidality in text.
## 📚 Documentation

### Introduction

Welcome to the Suicidality Detection AI Model! This project aims to provide a machine learning solution for detecting sequences of words indicative of suicidality in text. By utilizing the ELECTRA architecture and fine-tuning on a diverse dataset, we have created a powerful classification model that can distinguish between suicidal and non-suicidal text expressions.

### Labels

The model classifies input text into two labels:

- `LABEL_0`: indicates that the text is non-suicidal.
- `LABEL_1`: indicates that the text is indicative of suicidality.
### Training

The model was fine-tuned using the ELECTRA architecture on a carefully curated dataset. Our training process involved cleaning and preprocessing various text sources to create a comprehensive training set.
### Performance

The model's performance on the validation dataset is as follows:

| Property | Details |
|----------|---------|
| Accuracy | 0.939432 |
| Recall | 0.937164 |
| Precision | 0.92822 |
| F1 Score | 0.932672 |

These metrics demonstrate the model's ability to accurately classify sequences of text as either indicative of suicidality or non-suicidal.
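As a quick sanity check, the reported F1 score agrees with the harmonic mean of the reported precision and recall:

```python
# Reported validation metrics from the table above.
precision = 0.92822
recall = 0.937164
f1_reported = 0.932672

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

# Agrees with the reported value up to rounding of the inputs.
assert abs(f1 - f1_reported) < 1e-5
print(round(f1, 6))
```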
### Data Sources

We collected data from multiple sources to create a rich and diverse training dataset:

- https://www.kaggle.com/datasets/thedevastator/c-ssrs-labeled-suicidality-in-500-anonymized-red
- https://www.kaggle.com/datasets/amangoyl/reddit-dataset-for-multi-task-nlp
- https://www.kaggle.com/datasets/imeshsonu/suicideal-phrases
- https://raw.githubusercontent.com/laxmimerit/twitter-suicidal-intention-dataset/master/twitter-suicidal_data.csv
- https://www.kaggle.com/datasets/mohanedmashaly/suicide-notes
- https://www.kaggle.com/datasets/natalialech/suicidal-ideation-on-twitter

The data underwent thorough cleaning and preprocessing before being used for training the model.
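The model card does not publish the exact cleaning pipeline. Purely as an illustration of the kind of preprocessing typically applied to Twitter and Reddit text, a cleaning step might normalize case and strip URLs and user mentions; the `clean_text` helper below is a hypothetical sketch, not the actual pipeline used:

```python
import re

def clean_text(text: str) -> str:
    """Illustrative cleaning: lowercase, strip URLs and @mentions, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # remove URLs
    text = re.sub(r"@\w+", " ", text)           # remove user mentions
    text = re.sub(r"\s+", " ", text).strip()    # collapse whitespace
    return text

print(clean_text("Check this @user https://example.com  NOW"))  # → "check this now"
```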
## 🔧 Technical Details

We used the ELECTRA architecture for fine-tuning. The training process included cleaning and preprocessing various text sources to build a comprehensive training set, which enabled the model to achieve good performance in text classification related to suicidality.
## 📄 License

The model is licensed under CC0-1.0.
## ⚠️ Important Note

Suicidality is a sensitive and serious topic. It's important to exercise caution and consider ethical implications when using this model. Predictions made by the model should be handled with care and used to complement human judgment and intervention.
## 🙏 Model Credits

We would like to acknowledge the "gooohjy/suicidal-electra" model available on Hugging Face's model repository. You can find the model at [this link](https://huggingface.co/gooohjy/suicidal-electra). We used this model as a starting point and fine-tuned it to create our specialized suicidality detection model.
## 🤝 Contributions

We welcome contributions and feedback from the community to further improve the model's performance, enhance the dataset, and ensure its responsible deployment.