HebEMO - Emotion Recognition Model for Modern Hebrew
HebEMO detects polarity and extracts emotions from modern Hebrew user-generated content (UGC). It was trained on a Covid-19 related dataset that we collected and annotated, and it reaches a weighted average F1-score of 0.96 for polarity classification.
Quick Start
HebEMO is available as an online demo on Hugging Face Spaces and as a Colab notebook. Here is a basic usage example:
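# Notebook cell (Colab/Jupyter): the leading "!" runs a shell command
!git clone https://github.com/avichaychriqui/HeBERT.git
from HeBERT.src.HebEMO import *
HebEMO_model = HebEMO()
# Analyze text read from a file
HebEMO_model.hebemo(input_path='data/text_example.txt')
# Or analyze a single string and plot the result
hebEMO_df = HebEMO_model.hebemo(text='החיים יפים ומאושרים', plot=True)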
Features
- High Performance: HebEMO achieved a weighted average F1-score of 0.96 for polarity classification. For emotion detection, F1-scores ranged from 0.78 to 0.97, except for surprise (F1 = 0.41).
- Unique Dataset: Trained on a specially collected and annotated Covid-19 related dataset of modern Hebrew UGC.
- Multi-Emotion Detection: Detects eight emotions (anger, disgust, anticipation, fear, joy, sadness, surprise, and trust) along with overall sentiment (polarity).
Documentation
Emotion UGC Data Description
Our UGC data consists of comments on news articles from three major Israeli news sites, collected between January 2020 and August 2020. The data is about 150 MB, containing over 7 million words and 350K sentences.
Approximately 2000 sentences were annotated by crowd members (3 to 10 annotators per sentence) for overall sentiment (polarity) and eight emotions. The table below shows the fraction of sentences in which each emotion appeared:
| | anger | disgust | anticipation | fear | joy | sadness | surprise | trust | sentiment |
|---|---|---|---|---|---|---|---|---|---|
| ratio | 0.78 | 0.83 | 0.58 | 0.45 | 0.12 | 0.59 | 0.17 | 0.11 | 0.25 |
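As a rough illustration of how such ratios are computed, the sketch below takes binary per-sentence labels and returns the share of sentences in which each emotion appears. The toy annotations and the function name `emotion_ratios` are hypothetical. How the crowd votes were aggregated into a single label per sentence is not described here, so the input is assumed to be already aggregated.

```python
def emotion_ratios(annotations):
    """Share of sentences in which each emotion is marked present (1)."""
    n = len(annotations)
    emotions = annotations[0].keys()
    return {e: sum(row[e] for row in annotations) / n for e in emotions}


# Hypothetical aggregated labels for four sentences (1 = present, 0 = absent).
annotations = [
    {"anger": 1, "joy": 0, "fear": 1},
    {"anger": 1, "joy": 0, "fear": 0},
    {"anger": 0, "joy": 1, "fear": 0},
    {"anger": 1, "joy": 0, "fear": 1},
]

print(emotion_ratios(annotations))  # {'anger': 0.75, 'joy': 0.25, 'fear': 0.5}
```

The ratios need not sum to 1, because a single sentence can carry several emotions at once.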
Performance
Emotion Recognition
| emotion | f1-score | precision | recall |
|---|---|---|---|
| anger | 0.96 | 0.99 | 0.93 |
| disgust | 0.97 | 0.98 | 0.96 |
| anticipation | 0.82 | 0.80 | 0.87 |
| fear | 0.79 | 0.88 | 0.72 |
| joy | 0.90 | 0.97 | 0.84 |
| sadness | 0.90 | 0.86 | 0.94 |
| surprise | 0.40 | 0.44 | 0.37 |
| trust | 0.83 | 0.86 | 0.80 |
The metrics above are for the positive class, that is, texts in which the emotion is present.
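To make the positive-class reading concrete, the sketch below computes precision, recall, and F1 for a single emotion treated as a binary label. The function names `positive_class_metrics` and `f1_from` and the toy labels are illustrative assumptions, not part of HebEMO.

```python
def positive_class_metrics(y_true, y_pred):
    """Precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Toy labels (hypothetical): 1 = emotion present, 0 = absent.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]
print(positive_class_metrics(y_true, y_pred))  # (0.75, 0.75, 0.75)

# Consistency check against the reported anger scores (precision 0.99, recall 0.93).
print(round(f1_from(0.99, 0.93), 2))  # 0.96
```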
Sentiment (Polarity) Analysis
| | precision | recall | f1-score |
|---|---|---|---|
| neutral | 0.83 | 0.56 | 0.67 |
| positive | 0.96 | 0.92 | 0.94 |
| negative | 0.97 | 0.99 | 0.98 |
| accuracy | | | 0.97 |
| macro avg | 0.92 | 0.82 | 0.86 |
| weighted avg | 0.96 | 0.97 | 0.96 |
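The macro-average figures can be reproduced from the per-class scores, since a macro average is the unweighted mean over the three classes. The sketch below does that arithmetic. The weighted average cannot be recomputed the same way because it needs per-class support counts, which are not listed. The dictionary and function names are illustrative.

```python
# Reported per-class scores: (precision, recall, f1-score).
per_class = {
    "neutral": (0.83, 0.56, 0.67),
    "positive": (0.96, 0.92, 0.94),
    "negative": (0.97, 0.99, 0.98),
}


def macro_average(scores):
    """Unweighted mean of each metric across classes, rounded to 2 places."""
    n = len(scores)
    columns = zip(*scores.values())  # group precision, recall, f1 separately
    return tuple(round(sum(col) / n, 2) for col in columns)


print(macro_average(per_class))  # (0.92, 0.82, 0.86)
```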
The sentiment (polarity) analysis model is also available on AWS. For more information, visit the AWS Git repository.
Usage for Different Models
Emotion Recognition Model
The Quick Start example above shows how to use the emotion recognition model. It returns the analysis as a pandas.DataFrame.
Sentiment Classification Model (Polarity ONLY)
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
tokenizer = AutoTokenizer.from_pretrained("avichr/heBERT_sentiment_analysis")
# Load the model with its classification head (plain AutoModel omits it)
model = AutoModelForSequenceClassification.from_pretrained("avichr/heBERT_sentiment_analysis")
sentiment_analysis = pipeline(
    "sentiment-analysis",
    model=model,
    tokenizer=tokenizer,
    # Return a score for every label; newer transformers releases
    # deprecate this flag in favor of top_k=None
    return_all_scores=True
)
sentiment_analysis('אני מתלבט מה לאכול לארוחת צהריים')
>>> [[{'label': 'neutral', 'score': 0.9978172183036804},
>>> {'label': 'positive', 'score': 0.0014792329166084528},
>>> {'label': 'negative', 'score': 0.0007035882445052266}]]
sentiment_analysis('קפה זה טעים')
>>> [[{'label': 'neutral', 'score': 0.00047328314394690096},
>>> {'label': 'positive', 'score': 0.9994067549705505},
>>> {'label': 'negative', 'score': 0.00011996887042187154}]]
sentiment_analysis('אני לא אוהב את העולם')
>>> [[{'label': 'neutral', 'score': 9.214012970915064e-05},
>>> {'label': 'positive', 'score': 8.876807987689972e-05},
>>> {'label': 'negative', 'score': 0.9998190999031067}]]
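With `return_all_scores=True`, each input text yields a list of label/score dictionaries. A minimal sketch for picking the most likely label from that structure follows. It reuses the output of the first example above, and `top_label` is an illustrative helper, not part of transformers.

```python
def top_label(result):
    """Return (label, score) with the highest score for one input text."""
    best = max(result, key=lambda item: item["score"])
    return best["label"], best["score"]


# Output of the first example above: one inner list per input text.
output = [[{'label': 'neutral', 'score': 0.9978172183036804},
           {'label': 'positive', 'score': 0.0014792329166084528},
           {'label': 'negative', 'score': 0.0007035882445052266}]]

for scores in output:
    label, score = top_label(scores)
    print(label, round(score, 4))  # neutral 0.9978
```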
License
If you use this model, please cite us as follows:
Chriqui, A., & Yahav, I. (2022). HeBERT & HebEMO: a Hebrew BERT Model and a Tool for Polarity Analysis and Emotion Recognition. INFORMS Journal on Data Science, forthcoming.
@article{chriqui2021hebert,
title={HeBERT \& HebEMO: a Hebrew BERT Model and a Tool for Polarity Analysis and Emotion Recognition},
author={Chriqui, Avihay and Yahav, Inbal},
journal={INFORMS Journal on Data Science},
year={2022}
}
Contact Us