bert-base-uncased-emotion: Open-Source Emotion Analysis Model for Accurate Text Emotion Classification

Bert Base Uncased Emotion

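Developed by bhadresh-savani

An emotion analysis model based on the BERT architecture, fine-tuned on a Twitter emotion dataset for text emotion classification

Text classification | English | License: Apache-2.0 | #High-accuracy emotion analysis #BERT fine-tuning #Twitter text classification

Downloads: 17.20k

Released: 3/2/2022

Model Overview

This model is a pretrained BERT model fine-tuned specifically for emotion analysis. It recognizes six basic emotions in text: sadness, joy, love, anger, fear, and surprise.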

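Model Features

High accuracy

Reaches 92.65% accuracy on the emotion classification task

Multi-emotion recognition

Recognizes six distinct emotion categories

Transformer-based

Uses BERT's bidirectional encoder architecture, which draws on both the left and right context of each token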

Model Capabilities

Text classification

Emotion analysis

Natural language understanding

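Use Cases

Social media analysis

Tweet emotion analysis

Analyzes the emotional tone of tweets from Twitter users

Accurately identifies the six basic emotions

Customer feedback analysis

Product review emotion classification

Automatically classifies the emotion customers express in product reviews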

🚀 bert-base-uncased-emotion

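This project fine-tunes BERT on an emotion dataset for text emotion classification. On the Twitter emotion dataset it reports a test accuracy of 94.05% and an F1 score of 94.06.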

🚀 Quick Start

The following example shows how to use the model for text emotion classification:

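from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub.
# return_all_scores=True returns a score for every label; recent transformers
# releases deprecate it in favor of top_k=None.
classifier = pipeline("text-classification", model="bhadresh-savani/bert-base-uncased-emotion", return_all_scores=True)
prediction = classifier("I love using transformers. The best part is wide range of support and its easy to use")
print(prediction)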

"""
output:
[[
{'label': 'sadness', 'score': 0.0005138228880241513}, 
{'label': 'joy', 'score': 0.9972520470619202}, 
{'label': 'love', 'score': 0.0007443308713845909}, 
{'label': 'anger', 'score': 0.0007404946954920888}, 
{'label': 'fear', 'score': 0.00032938539516180754}, 
{'label': 'surprise', 'score': 0.0004197491507511586}
]]
"""
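To reduce the per-label scores above to a single predicted emotion, one option is to pick the entry with the highest score for each input. The sketch below is illustrative only: the helper name `top_emotion` is hypothetical, and it assumes the nested list-of-lists shape shown in the sample output.

```python
from typing import Dict, List, Union

Score = Dict[str, Union[str, float]]


def top_emotion(prediction: List[List[Score]]) -> List[Score]:
    """Return the highest-scoring label entry for each input text."""
    return [max(scores, key=lambda item: item["score"]) for scores in prediction]


# Scores copied from the sample output above (one input text, six labels).
sample = [[
    {"label": "sadness", "score": 0.0005138228880241513},
    {"label": "joy", "score": 0.9972520470619202},
    {"label": "love", "score": 0.0007443308713845909},
    {"label": "anger", "score": 0.0007404946954920888},
    {"label": "fear", "score": 0.00032938539516180754},
    {"label": "surprise", "score": 0.0004197491507511586},
]]

best = top_emotion(sample)
print(best[0]["label"])  # prints: joy
```

Passing a list of texts to the classifier should yield one inner list per text, so the same helper would return one top entry per input.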

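✨ Key Features

BERT architecture: a bidirectional Transformer encoder pretrained with the MLM (masked language modeling) objective.
Fine-tuned: further trained on an emotion dataset to adapt it to text emotion classification.
Evaluated on multiple metrics: reports both accuracy and F1 score on the test set.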

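📦 Installation

The original documentation does not give installation steps; follow the installation instructions for the relevant Hugging Face libraries.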
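As a rough guide, the quick-start code needs the `transformers` package plus a deep learning backend such as PyTorch, which are typically installed with `pip install transformers torch`. That command is an assumption based on the usual Hugging Face setup, not something the model card states. The hypothetical helper below checks which of those packages are still missing before the example is run.

```python
import importlib.util
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed requirements for the quick-start example (not listed in the model card).
REQUIRED = ["transformers", "torch"]

missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages, try: pip install " + " ".join(missing))
else:
    print("All required packages are installed.")
```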

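📚 Detailed Documentation

Model Description

BERT is a Transformer-based bidirectional encoder architecture trained with the MLM (masked language modeling) objective. bert-base-uncased was fine-tuned on the emotion dataset using the Hugging Face Trainer with the following hyperparameters:

Learning rate: 2e-5
Batch size: 64
Training epochs: 8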
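The hyperparameters above map onto the Hugging Face `TrainingArguments` class roughly as follows. This is a sketch, not the author's actual configuration: the output directory name is hypothetical, and treating the batch size of 64 as a per-device value is an assumption.

```python
# Hyperparameters as listed above.
HYPERPARAMETERS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 64,  # assumption: 64 is the per-device batch size
    "num_train_epochs": 8,
}


def build_training_arguments(output_dir: str = "bert-base-uncased-emotion"):
    """Build TrainingArguments from the hyperparameters above.

    The default output_dir is a hypothetical name chosen for illustration.
    """
    # Imported lazily so the constants can be inspected without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMETERS)
```

Calling `build_training_arguments()` requires `transformers` and its training dependencies to be installed; the result can then be passed to `Trainer(args=...)`.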

Model Performance Comparison on the Twitter Emotion Dataset

Model	Accuracy (%)	F1 score	Test samples per second
Distilbert-base-uncased-emotion	93.8	93.79	398.69
Bert-base-uncased-emotion	94.05	94.06	190.152
Roberta-base-emotion	93.95	93.97	195.639
Albert-base-v2-emotion	93.6	93.65	182.794
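The table implies a trade-off between accuracy and inference throughput. As a small worked example, the snippet below compares the BERT model against the DistilBERT one using the figures from the table; the `compare` helper is hypothetical and exists only for illustration.

```python
from typing import Dict, Tuple

# (accuracy %, F1, test samples per second), copied from the table above.
RESULTS: Dict[str, Tuple[float, float, float]] = {
    "Distilbert-base-uncased-emotion": (93.8, 93.79, 398.69),
    "Bert-base-uncased-emotion": (94.05, 94.06, 190.152),
    "Roberta-base-emotion": (93.95, 93.97, 195.639),
    "Albert-base-v2-emotion": (93.6, 93.65, 182.794),
}


def compare(model: str, reference: str) -> Tuple[float, float]:
    """Return (accuracy difference in points, throughput ratio) of model vs. reference."""
    acc_m, _, speed_m = RESULTS[model]
    acc_r, _, speed_r = RESULTS[reference]
    return acc_m - acc_r, speed_m / speed_r


gain, ratio = compare("Bert-base-uncased-emotion", "Distilbert-base-uncased-emotion")
print(f"{gain:+.2f} points accuracy, {ratio:.2f}x throughput")
# prints: +0.25 points accuracy, 0.48x throughput
```

In other words, on these figures BERT gains a quarter of a percentage point in accuracy over DistilBERT while processing roughly half as many test samples per second.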

Dataset

Twitter-Sentiment-Analysis

Training Procedure

See the Colab Notebook and change the model name from distilbert to bert.
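The notebook itself is not reproduced here, so the following is only a guess at what the checkpoint swap might look like with the `transformers` Auto classes. The function name is hypothetical, and the label order is assumed to match the order of the sample pipeline output in this document.

```python
# Checkpoint swapped in for the DistilBERT one, as the note above suggests.
CHECKPOINT = "bert-base-uncased"

# Assumption: class indices follow the order of the sample pipeline output.
LABELS = ["sadness", "joy", "love", "anger", "fear", "surprise"]
ID2LABEL = dict(enumerate(LABELS))
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}


def load_for_finetuning(checkpoint: str = CHECKPOINT):
    """Load a tokenizer and a six-class sequence classification model."""
    # Imported lazily so the constants can be inspected without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint,
        num_labels=len(LABELS),
        id2label=ID2LABEL,
        label2id=LABEL2ID,
    )
    return tokenizer, model
```

Calling `load_for_finetuning()` downloads the base checkpoint from the Hugging Face Hub, so it needs network access and an installed backend.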

Evaluation Results

{
    'test_accuracy': 0.9405,
    'test_f1': 0.9405920712282673,
    'test_loss': 0.15769127011299133,
    'test_runtime': 10.5179,
    'test_samples_per_second': 190.152,
    'test_steps_per_second': 3.042
}
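The runtime and throughput figures can be cross-checked with a little arithmetic: multiplying the runtime by the per-second rates recovers the approximate number of test samples and evaluation steps. The sketch below assumes the evaluation batch size matched the training batch size of 64, which the source does not state explicitly.

```python
import math

# Values copied from the evaluation results above.
results = {
    "test_runtime": 10.5179,
    "test_samples_per_second": 190.152,
    "test_steps_per_second": 3.042,
}

BATCH_SIZE = 64  # assumption: evaluation used the same batch size as training

# runtime (s) * rate (1/s) gives a count; round to the nearest integer.
samples = round(results["test_runtime"] * results["test_samples_per_second"])
steps = round(results["test_runtime"] * results["test_steps_per_second"])
expected_steps = math.ceil(samples / BATCH_SIZE)

print(samples, steps, expected_steps)  # prints: 2000 32 32
```

The derived step count agrees with a batch size of 64 over roughly 2,000 test samples, so the reported figures are internally consistent under that assumption.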