HebEMO - Emotion Recognition Model for Modern Hebrew
HebEMO is a tool for detecting polarity and extracting emotions from modern Hebrew User-Generated Content (UGC). It was trained on a unique Covid-19-related dataset that we collected and annotated.
Quick Start
HebEMO performs well in both polarity classification and emotion detection. For polarity classification, it achieved a weighted average F1-score of 0.96. In emotion detection, most emotions reached an F1-score between 0.78 and 0.97, except for surprise, which had an F1-score of 0.41. These results outperform the best-reported performance, even when compared to English language models.
Features
Emotion UGC Data Description
Our UGC data consists of comments on news articles from 3 major Israeli news sites, collected between January 2020 and August 2020. The data is about 150 MB in size, containing over 7 million words and 350K sentences.
Approximately 2000 sentences were annotated by crowd members (3-10 annotators per sentence) for overall sentiment (polarity) and eight emotions: anger, disgust, anticipation, fear, joy, sadness, surprise, and trust.
The proportion of sentences in which each emotion appeared is shown in the table below.
| | anger | disgust | expectation | fear | happy | sadness | surprise | trust | sentiment |
|---|---|---|---|---|---|---|---|---|---|
| ratio | 0.78 | 0.83 | 0.58 | 0.45 | 0.12 | 0.59 | 0.17 | 0.11 | 0.25 |
Performance
Emotion Recognition
| emotion | f1-score | precision | recall |
|---|---|---|---|
| anger | 0.96 | 0.99 | 0.93 |
| disgust | 0.97 | 0.98 | 0.96 |
| anticipation | 0.82 | 0.80 | 0.87 |
| fear | 0.79 | 0.88 | 0.72 |
| joy | 0.90 | 0.97 | 0.84 |
| sadness | 0.90 | 0.86 | 0.94 |
| surprise | 0.40 | 0.44 | 0.37 |
| trust | 0.83 | 0.86 | 0.80 |
The above metrics are for the positive class (that is, the emotion is reflected in the text).
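As a sanity check on the table above, F1 is the harmonic mean of precision and recall, so each reported F1-score can be approximately recomputed from the other two columns. The following is a minimal sketch: the `f1_from` helper and the `reported` dictionary are illustrative names and not part of HebEMO, and small differences are expected because the published values are rounded to two decimals.

```python
# Illustrative helper, not part of the HebEMO package.
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (f1, precision, recall) copied from the emotion recognition table.
reported = {
    "anger": (0.96, 0.99, 0.93),
    "disgust": (0.97, 0.98, 0.96),
    "anticipation": (0.82, 0.80, 0.87),
    "fear": (0.79, 0.88, 0.72),
    "joy": (0.90, 0.97, 0.84),
    "sadness": (0.90, 0.86, 0.94),
    "surprise": (0.40, 0.44, 0.37),
    "trust": (0.83, 0.86, 0.80),
}

# Print the reported value next to the value recomputed from precision/recall.
for emotion, (f1, p, r) in reported.items():
    print(f"{emotion:<12} reported={f1:.2f} recomputed={f1_from(p, r):.3f}")
```

The recomputed values should agree with the reported ones to within rounding error.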
Sentiment (Polarity) Analysis
| | precision | recall | f1-score |
|---|---|---|---|
| neutral | 0.83 | 0.56 | 0.67 |
| positive | 0.96 | 0.92 | 0.94 |
| negative | 0.97 | 0.99 | 0.98 |
| accuracy | | | 0.97 |
| macro avg | 0.92 | 0.82 | 0.86 |
| weighted avg | 0.96 | 0.97 | 0.96 |
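The macro average row in the polarity table above is the unweighted mean of the three per-class rows. A short sketch, assuming only the rounded values from the table (the `per_class` dictionary and `macro_average` function are illustrative names), shows how it can be reproduced:

```python
# (precision, recall, f1) per class, copied from the polarity table.
per_class = {
    "neutral": (0.83, 0.56, 0.67),
    "positive": (0.96, 0.92, 0.94),
    "negative": (0.97, 0.99, 0.98),
}


def macro_average(rows):
    """Unweighted mean of each metric column across all classes."""
    n = len(rows)
    columns = zip(*rows.values())  # regroup values by metric
    return tuple(sum(col) / n for col in columns)


macro_p, macro_r, macro_f1 = macro_average(per_class)
print(f"macro avg: precision={macro_p:.2f} recall={macro_r:.2f} f1={macro_f1:.2f}")
```

The weighted average cannot be recomputed the same way, because it needs the number of test sentences per class (the support), which the table does not list.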
The sentiment (polarity) analysis model is also available on AWS. For more information, see AWS's Git repository.
Usage Examples
Basic Usage
Emotion Recognition Model
An online version of the model is available on Hugging Face Spaces or as a Colab notebook.
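# Notebook shell command: clone the HeBERT repository (in a terminal, drop the leading "!")
!git clone https://github.com/avichaychriqui/HeBERT.git
from HeBERT.src.HebEMO import *

HebEMO_model = HebEMO()
# Run the model on a text file
HebEMO_model.hebemo(input_path='data/text_example.txt')
# Run the model on a single string ("Life is beautiful and happy") and plot the result
hebEMO_df = HebEMO_model.hebemo(text='החיים יפים ומאושרים', plot=True)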

For the sentiment classification model (polarity only):
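from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# Load the tokenizer and the model with its classification head
# (plain AutoModel would load the encoder without the sentiment classifier)
tokenizer = AutoTokenizer.from_pretrained("avichr/heBERT_sentiment_analysis")
model = AutoModelForSequenceClassification.from_pretrained("avichr/heBERT_sentiment_analysis")

# return_all_scores=True returns a score for every label, not only the top one
# (recent transformers releases deprecate it in favor of top_k=None)
sentiment_analysis = pipeline(
    "sentiment-analysis",
    model=model,
    tokenizer=tokenizer,
    return_all_scores=True
)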
# "I am deliberating what to eat for lunch"
sentiment_analysis('אני מתלבט מה לאכול לארוחת צהריים')
# [[{'label': 'neutral', 'score': 0.9978172183036804},
#   {'label': 'positive', 'score': 0.0014792329166084528},
#   {'label': 'negative', 'score': 0.0007035882445052266}]]
# "Coffee is tasty"
sentiment_analysis('קפה זה טעים')
# [[{'label': 'neutral', 'score': 0.00047328314394690096},
#   {'label': 'positive', 'score': 0.9994067549705505},
#   {'label': 'negative', 'score': 0.00011996887042187154}]]
# "I do not like the world"
sentiment_analysis('אני לא אוהב את העולם')
# [[{'label': 'neutral', 'score': 9.214012970915064e-05},
#   {'label': 'positive', 'score': 8.876807987689972e-05},
#   {'label': 'negative', 'score': 0.9998190999031067}]]
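When only the predicted polarity is needed, the nested list shown above can be reduced to a single label per input text. The following is a minimal sketch under the assumption that the output has exactly the shape printed above (one inner list of `label`/`score` dictionaries per text); `top_label` is a hypothetical helper, not part of HeBERT or transformers.

```python
from typing import List


def top_label(all_scores: List[dict]) -> str:
    """Return the label with the highest score for one input text."""
    return max(all_scores, key=lambda item: item["score"])["label"]


# Output for the first example sentence, copied from the listing above.
example_output = [[
    {"label": "neutral", "score": 0.9978172183036804},
    {"label": "positive", "score": 0.0014792329166084528},
    {"label": "negative", "score": 0.0007035882445052266},
]]

# The outer list holds one entry per input text.
predictions = [top_label(scores) for scores in example_output]
print(predictions)
```

In practice, `example_output` would be replaced by the return value of `sentiment_analysis(...)`.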
License
If you use this model, please cite us as:
Chriqui, A., & Yahav, I. (2022). HeBERT & HebEMO: a Hebrew BERT Model and a Tool for Polarity Analysis and Emotion Recognition. INFORMS Journal on Data Science, forthcoming.
@article{chriqui2021hebert,
title={HeBERT \& HebEMO: a Hebrew BERT Model and a Tool for Polarity Analysis and Emotion Recognition},
author={Chriqui, Avihay and Yahav, Inbal},
journal={INFORMS Journal on Data Science},
year={2022}
}
Contact us
Avichay Chriqui
Inbal Yahav
The Coller Semitic Languages AI Lab
Thank you, תודה, شكرا