xlm_roberta_large-ru-sentiment-rusentiment open-source model - high-accuracy Russian sentiment analysis

Xlm Roberta Large Ru Sentiment Rusentiment

Developed by sismetanin

A Russian sentiment analysis model based on XLM-RoBERTa-Large, fine-tuned on the RuSentiment dataset.

Text classification

Transformers

Other #Russian sentiment analysis #Social network text #Multi-task fine-tuning

Downloads: 183

Release date: 3/2/2022

Model overview

This model specializes in sentiment analysis of Russian text and performs well on general-domain posts from the Russian social network VKontakte.

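Model features

High-performance Russian sentiment analysis

Ranks first by overall score (76.37) among the models in the evaluation table and leads on most individual benchmarks, ahead of Russian-specific models such as RuBERT and Conversational RuBERT.

Multi-domain adaptability

Performs well on sentiment analysis of both general social network text and domain-specific text such as banking.

Built on a strong multilingual base model

Fine-tuned from XLM-RoBERTa-Large, a strong multilingual pretrained model.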

Model capabilities

Sentiment classification of Russian text

Social media text analysis

Multi-domain sentiment recognition

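Use cases

Social media analysis

Sentiment analysis of VKontakte posts

Analyzes the sentiment of user posts on VKontakte, Russia's largest social network.

Achieves a weighted F1 score of 78.31% on the RuSentiment dataset.

Customer feedback analysis

Analysis of customer reviews in the banking domain

Analyzes customer sentiment in banking-related text.

Achieves an F1 score of 80.89% on the SentiRuEval-2016 Banks test set.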
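RuSentiment is reported with both a macro F1 and a weighted F1, and the two can differ noticeably when classes are imbalanced. The sketch below shows the difference in plain Python. The labels and predictions are invented toy data, not taken from any of the datasets named here, and the helper names are illustrative only.

```python
from collections import Counter

def per_class_f1(y_true, y_pred):
    """Return {label: F1}, computed one-vs-rest for every label seen."""
    pairs = list(zip(y_true, y_pred))
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores

def macro_f1(y_true, y_pred):
    """Plain mean of the per-class F1 scores: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)

def weighted_f1(y_true, y_pred):
    """Per-class F1 weighted by how often each class occurs in y_true."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[label] * support[label] for label in scores) / len(y_true)

# Toy data (made up): a frequent class, a medium class, and a rare class
# that the classifier always gets wrong.
Y_TRUE = ["neutral"] * 6 + ["positive"] * 3 + ["negative"]
Y_PRED = ["neutral"] * 5 + ["positive"] + ["positive"] * 2 + ["neutral"] + ["neutral"]

print(f"macro F1:    {macro_f1(Y_TRUE, Y_PRED):.4f}")
print(f"weighted F1: {weighted_f1(Y_TRUE, Y_PRED):.4f}")
```

On this toy data the weighted F1 comes out higher than the macro F1, because the missed rare class pulls the macro average down while contributing little weight to the weighted one.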

🚀 XLM-RoBERTa-Large-ru-sentiment-RuSentiment

XLM-RoBERTa-Large-ru-sentiment-RuSentiment is an XLM-RoBERTa-Large model fine-tuned on RuSentiment, a dataset of general-domain Russian-language posts from VKontakte, the largest social network in Russia.
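A minimal sketch of how the model might be loaded with the Transformers `pipeline` API is shown below. The Hub identifier is an assumption assembled from the developer name and model name on this page, so it should be checked before use. The label strings returned depend on the model's configuration, which this page does not document.

```python
# ASSUMPTION: Hub ID built from the developer and model names on this page.
MODEL_ID = "sismetanin/xlm_roberta_large-ru-sentiment-rusentiment"

def load_classifier(model_id=MODEL_ID):
    """Build a text-classification pipeline (downloads weights on first call)."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import pipeline
    return pipeline("text-classification", model=model_id)

def classify(texts, classifier):
    """Return a (text, label, score) tuple for each input text."""
    texts = list(texts)
    results = classifier(texts)  # one {"label": ..., "score": ...} dict per text
    return [(text, res["label"], res["score"]) for text, res in zip(texts, results)]

# Example call, left commented out because it downloads a large model:
# clf = load_classifier()
# for text, label, score in classify(["Какой чудесный день!"], clf):
#     print(f"{label}\t{score:.3f}\t{text}")
```

Keeping `classify` separate from model loading makes it easy to pass in any callable with the same output shape, for example a stub in unit tests.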

📚 Detailed documentation

Model evaluation results

Model	Score	Rank	SentiRuEval-2016 (TC: micro F1)	SentiRuEval-2016 (TC: macro F1)	SentiRuEval-2016 (TC: F1)	SentiRuEval-2016 (Banks: micro F1)	SentiRuEval-2016 (Banks: macro F1)	SentiRuEval-2016 (Banks: F1)	RuSentiment (weighted F1)	RuSentiment (F1)	KRND (F1)	LINIS Crowd (F1)	RuTweetCorp (F1)	RuReviews (F1)
SOTA	n/s		76.71	66.40	70.68	67.51	69.53	74.06	78.50	n/s	73.63	60.51	83.68	77.44
XLM-RoBERTa-Large	76.37	1	82.26	76.36	79.42	76.35	76.08	80.89	78.31	75.27	75.17	60.03	88.91	78.81
SBERT-Large	75.43	2	78.40	71.36	75.14	72.39	71.87	77.72	78.58	75.85	74.20	60.64	88.66	77.41
MBARTRuSumGazeta	74.70	3	76.06	68.95	73.04	72.34	71.93	77.83	76.71	73.56	74.18	60.54	87.22	77.51
Conversational RuBERT	74.44	4	76.69	69.09	73.11	69.44	68.68	75.56	77.31	74.40	73.10	59.95	87.86	77.78
LaBSE	74.11	5	77.00	69.19	73.55	70.34	69.83	76.38	74.94	70.84	73.20	59.52	87.89	78.47
XLM-RoBERTa-Base	73.60	6	76.35	69.37	73.42	68.45	67.45	74.05	74.26	70.44	71.40	60.19	87.90	78.28
RuBERT	73.45	7	74.03	66.14	70.75	66.46	66.40	73.37	75.49	71.86	72.15	60.55	86.99	77.41
MBART-50-Large-Many-to-Many	73.15	8	75.38	67.81	72.26	67.13	66.97	73.85	74.78	70.98	71.98	59.20	87.05	77.24
SlavicBERT	71.96	9	71.45	63.03	68.44	64.32	63.99	71.31	72.13	67.57	72.54	58.70	86.43	77.16
EnRuDR-BERT	71.51	10	72.56	64.74	69.07	61.44	60.21	68.34	74.19	69.94	69.33	56.55	87.12	77.95
RuDR-BERT	71.14	11	72.79	64.23	68.36	61.86	60.92	68.48	74.65	70.63	68.74	54.45	87.04	77.91
MBART-50-Large	69.46	12	70.91	62.67	67.24	61.12	60.25	68.41	72.88	68.63	70.52	46.39	86.48	77.52

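The table shows the per-task scores and the macro-average of those scores, which determines each model's position on the leaderboard. For datasets with several evaluation metrics (for example, macro F1 and weighted F1 for RuSentiment), the unweighted average of those metrics is used as the task score when computing the overall macro-average. The GLUE benchmark applies the same strategy for comparing model results.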
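The scoring rule can be sketched with the XLM-RoBERTa-Large row of the table. Two things here are assumptions rather than statements from the source: that the per-dataset average is a plain (unweighted) mean, as in GLUE, and that all six SentiRuEval-2016 columns (TC and Banks) count as a single task. The function names are illustrative.

```python
from statistics import mean

# XLM-RoBERTa-Large row from the table, grouped by dataset.
# ASSUMPTION: the six SentiRuEval-2016 columns (TC and Banks) form one task.
XLMR_LARGE = {
    "SentiRuEval-2016": [82.26, 76.36, 79.42, 76.35, 76.08, 80.89],
    "RuSentiment": [78.31, 75.27],  # weighted F1, F1
    "KRND": [75.17],
    "LINIS Crowd": [60.03],
    "RuTweetCorp": [88.91],
    "RuReviews": [78.81],
}

def task_scores(metrics_by_task):
    """Collapse each dataset's metrics into one score with an unweighted mean."""
    return {task: mean(values) for task, values in metrics_by_task.items()}

def leaderboard_score(metrics_by_task):
    """Macro-average: unweighted mean of the per-task scores."""
    return mean(list(task_scores(metrics_by_task).values()))

if __name__ == "__main__":
    for task, score in task_scores(XLMR_LARGE).items():
        print(f"{task:18s} {score:.2f}")
    print(f"{'Overall':18s} {leaderboard_score(XLMR_LARGE):.2f}")
```

Under these assumptions the RuSentiment task score is 76.79 and the overall score is about 76.38, within 0.01 of the 76.37 in the Score column. Not every row reproduces this closely, so the grouping should be treated as a plausible reading rather than a confirmed rule.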

📄 Citation

If you find this repository helpful, please cite the following publication.

@article{Smetanin2021Deep,
  author = {Sergey Smetanin and Mikhail Komarov},
  title = {Deep transfer learning baselines for sentiment analysis in Russian},
  journal = {Information Processing \& Management},
  volume = {58},
  number = {3},
  pages = {102484},
  year = {2021},
  issn = {0306-4573},
  doi = {10.1016/j.ipm.2020.102484}
}

Dataset:

@inproceedings{rogers2018rusentiment,
  title={RuSentiment: An enriched sentiment analysis dataset for social media in Russian},
  author={Rogers, Anna and Romanov, Alexey and Rumshisky, Anna and Volkova, Svitlana and Gronas, Mikhail and Gribov, Alex},
  booktitle={Proceedings of the 27th international conference on computational linguistics},
  pages={755--763},
  year={2018}
}