HebEMO - Emotion Recognition Model for Modern Hebrew
HebEMO is a tool that detects polarity and extracts emotions from modern Hebrew User-Generated Content (UGC). It was trained on a Covid-19 related dataset that we collected and annotated, and reaches a weighted average F1-score of 0.96 for polarity classification.
Quick Start
HebEMO provides emotion recognition and sentiment classification for Hebrew text. You can try the model online through Hugging Face Spaces or as a Colab notebook.
Usage Examples
Basic Usage
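# Clone the repository (the leading "!" is notebook shell syntax; drop it in a terminal)
!git clone https://github.com/avichaychriqui/HeBERT.git
from HeBERT.src.HebEMO import HebEMO
HebEMO_model = HebEMO()
# Analyze the texts stored in a file
HebEMO_model.hebemo(input_path='data/text_example.txt')
# Analyze a single string ("Life is beautiful and happy") and plot the result
hebEMO_df = HebEMO_model.hebemo(text='החיים יפים ומאושרים', plot=True)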

Advanced Usage
To use the sentiment classification model (polarity only):
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
tokenizer = AutoTokenizer.from_pretrained("avichr/heBERT_sentiment_analysis")
# Load the model with its classification head (plain AutoModel would omit it)
model = AutoModelForSequenceClassification.from_pretrained("avichr/heBERT_sentiment_analysis")
sentiment_analysis = pipeline(
    "sentiment-analysis",
    model=model,
    tokenizer=tokenizer,
    return_all_scores=True,  # score every label; newer transformers releases use top_k=None
)
# "I am debating what to eat for lunch"
sentiment_analysis('אני מתלבט מה לאכול לארוחת צהריים')
# [[{'label': 'neutral', 'score': 0.9978172183036804},
#   {'label': 'positive', 'score': 0.0014792329166084528},
#   {'label': 'negative', 'score': 0.0007035882445052266}]]
# "Coffee is tasty"
sentiment_analysis('קפה זה טעים')
# [[{'label': 'neutral', 'score': 0.00047328314394690096},
#   {'label': 'positive', 'score': 0.9994067549705505},
#   {'label': 'negative', 'score': 0.00011996887042187154}]]
# "I don't like the world"
sentiment_analysis('אני לא אוהב את העולם')
# [[{'label': 'neutral', 'score': 9.214012970915064e-05},
#   {'label': 'positive', 'score': 8.876807987689972e-05},
#   {'label': 'negative', 'score': 0.9998190999031067}]]
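With `return_all_scores=True`, each input text yields a list of label/score dictionaries, as in the outputs above. The following is a minimal sketch, assuming that output shape, of reducing it to the single most likely label. The helper name `top_label` is hypothetical and is not part of HeBERT or transformers.

```python
def top_label(scores):
    """Return the (label, score) pair with the highest score.

    `scores` is the list of {'label', 'score'} dicts produced for ONE input text.
    """
    best = max(scores, key=lambda item: item["score"])
    return best["label"], best["score"]


# Scores copied from the first example above (one inner list per input text).
example_output = [[
    {"label": "neutral", "score": 0.9978172183036804},
    {"label": "positive", "score": 0.0014792329166084528},
    {"label": "negative", "score": 0.0007035882445052266},
]]

label, score = top_label(example_output[0])
print(label, round(score, 3))  # neutral 0.998
```

Because the pipeline returns one inner list per input, index the outer list first (`example_output[0]`) before passing the scores to the helper.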
⨠Features
- High Performance: HebEMO achieved a weighted average F1-score of 0.96 for polarity classification. Emotion detection reached F1-scores of 0.78-0.97 for every emotion except surprise, which the model captured poorly (F1 = 0.40 in the performance table below). These results outperform the best-reported results, even when compared to English-language models.
- Unique Dataset: Trained on a specially collected and annotated Covid-19 related UGC dataset from major Israeli news sites.
- Multi-Emotion Detection: Capable of detecting eight emotions: anger, disgust, anticipation, fear, joy, sadness, surprise, and trust, along with overall sentiment (polarity).
Installation
!pip install pyplutchik==0.0.7
!pip install transformers==4.14.1
!git clone https://github.com/avichaychriqui/HeBERT.git
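Since the commands above pin exact versions, it can help to confirm what is actually installed before running the model. The following is a small sketch using only the standard library (`importlib.metadata`, Python 3.8+). The helper name `check_pins` is hypothetical; the pinned versions are copied from the commands above.

```python
from importlib import metadata

# Versions pinned in the installation commands above.
PINNED = {"pyplutchik": "0.0.7", "transformers": "4.14.1"}


def check_pins(pins):
    """Return {package: (expected, found)} for every package whose version differs.

    `found` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


# Report every package that is missing or installed at a different version.
for name, (expected, found) in check_pins(PINNED).items():
    print(f"{name}: expected {expected}, found {found}")
```

An empty result means both packages match the pinned versions.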
Documentation
Emotion UGC Data Description
Our UGC data consists of comments on news articles from three major Israeli news sites, collected between January 2020 and August 2020. The dataset is approximately 150 MB, containing over 7 million words and 350K sentences.
Around 2000 sentences were annotated by crowd members (3-10 annotators per sentence) for overall sentiment (polarity) and eight emotions. The table below shows the fraction of sentences in which each emotion appeared.
| | anger | disgust | expectation | fear | happy | sadness | surprise | trust | sentiment |
|---|---|---|---|---|---|---|---|---|---|
| ratio | 0.78 | 0.83 | 0.58 | 0.45 | 0.12 | 0.59 | 0.17 | 0.11 | 0.25 |
Performance
Emotion Recognition
| emotion | f1-score | precision | recall |
|---|---|---|---|
| anger | 0.96 | 0.99 | 0.93 |
| disgust | 0.97 | 0.98 | 0.96 |
| anticipation | 0.82 | 0.80 | 0.87 |
| fear | 0.79 | 0.88 | 0.72 |
| joy | 0.90 | 0.97 | 0.84 |
| sadness | 0.90 | 0.86 | 0.94 |
| surprise | 0.40 | 0.44 | 0.37 |
| trust | 0.83 | 0.86 | 0.80 |
The above metrics are for the positive class (meaning, the emotion is reflected in the text).
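For readers checking the table, the F1-score is the harmonic mean of precision and recall. The short sketch below recomputes it for two rows; the function name `f1_score` is only illustrative. Because the table reports rounded values, recomputing from them can differ from the listed F1 by about 0.01 (as happens for anticipation).

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall for "anger" and "surprise", taken from the table above.
print(round(f1_score(0.99, 0.93), 2))  # 0.96
print(round(f1_score(0.44, 0.37), 2))  # 0.4
```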
Sentiment (Polarity) Analysis
| | precision | recall | f1-score |
|---|---|---|---|
| neutral | 0.83 | 0.56 | 0.67 |
| positive | 0.96 | 0.92 | 0.94 |
| negative | 0.97 | 0.99 | 0.98 |
| accuracy | | | 0.97 |
| macro avg | 0.92 | 0.82 | 0.86 |
| weighted avg | 0.96 | 0.97 | 0.96 |
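The two averages in the table differ in how they weight the classes. The macro average is the plain mean over the three classes, while the weighted average weights each class by its number of sentences. The sketch below reproduces the macro row from the per-class values. The per-class sentence counts are not given here, so the weighted row is not recomputed, and `weighted_average` is shown only to illustrate the formula.

```python
# Per-class scores for neutral, positive, negative, taken from the table above.
per_class = {
    "precision": [0.83, 0.96, 0.97],
    "recall": [0.56, 0.92, 0.99],
    "f1-score": [0.67, 0.94, 0.98],
}


def macro_average(values):
    """Unweighted mean over classes."""
    return sum(values) / len(values)


def weighted_average(values, supports):
    """Mean over classes, weighted by the number of samples in each class."""
    total = sum(supports)
    return sum(v * s for v, s in zip(values, supports)) / total


for metric, values in per_class.items():
    print(metric, round(macro_average(values), 2))
# precision 0.92
# recall 0.82
# f1-score 0.86
```

The weighted F1 (0.96) sits well above the macro F1 (0.86), which suggests that the lower-scoring neutral class accounts for a small share of the evaluated sentences.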
The sentiment (polarity) analysis model is also available on AWS. For more information, see the AWS GitHub repository.
License
Please cite our work if you use this model:
Chriqui, A., & Yahav, I. (2022). HeBERT & HebEMO: a Hebrew BERT Model and a Tool for Polarity Analysis and Emotion Recognition. INFORMS Journal on Data Science, forthcoming.
@article{chriqui2021hebert,
  title={HeBERT \& HebEMO: a Hebrew BERT Model and a Tool for Polarity Analysis and Emotion Recognition},
  author={Chriqui, Avihay and Yahav, Inbal},
  journal={INFORMS Journal on Data Science},
  year={2022}
}
Contact Us
Thank you, תודה, شكرا