bert-base-uncased-emotion: an open-source emotion analysis model for text emotion classification

Bert Base Uncased Emotion

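Developed by bhadresh-savani

A BERT-based emotion analysis model, fine-tuned on a Twitter emotion dataset for text emotion classification.

Text Classification | English | License: Apache-2.0 | #HighAccuracyEmotionAnalysis #BERTFineTuning #TwitterTextClassification

Downloads: 17.20k

Released: 3/2/2022

Model Overview

This model starts from a pretrained BERT checkpoint and is fine-tuned for emotion analysis. It assigns a text to one of six basic emotions: sadness, joy, love, anger, fear, and surprise.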

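Model Features

High accuracy

Reaches 92.65% accuracy on the emotion classification task (the evaluation results on this page report a test accuracy of 94.05%)

Multi-emotion recognition

Distinguishes six emotion categories

Transformer-based

Uses BERT's bidirectional encoder architecture, which conditions each token on both its left and right context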

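Model Capabilities

Text classification

Emotion analysis

Natural language understanding

Use Cases

Social media analysis

Tweet emotion analysis

Analyze the emotional tone of tweets from Twitter users

Identifies the six basic emotions

Customer feedback analysis

Product review emotion classification

Automatically classify the emotion expressed in customer product reviews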

🚀 bert-base-uncased-emotion

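This project fine-tunes BERT on an emotion dataset for text emotion classification. It predicts the emotion category of a text and reaches 94.05% accuracy on the Twitter emotion dataset.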

🚀 Quick Start

The following example runs text emotion classification with the model:

from transformers import pipeline

# return_all_scores=True returns a score for every label; newer transformers
# releases deprecate this flag in favor of top_k=None.
classifier = pipeline(
    "text-classification",
    model="bhadresh-savani/bert-base-uncased-emotion",
    return_all_scores=True,
)
prediction = classifier("I love using transformers. The best part is wide range of support and its easy to use")
print(prediction)

"""
output:
[[
{'label': 'sadness', 'score': 0.0005138228880241513},
{'label': 'joy', 'score': 0.9972520470619202},
{'label': 'love', 'score': 0.0007443308713845909},
{'label': 'anger', 'score': 0.0007404946954920888},
{'label': 'fear', 'score': 0.00032938539516180754},
{'label': 'surprise', 'score': 0.0004197491507511586}
]]
"""
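
The pipeline output above is a nested list: one inner list per input text, holding one score per label. In practice a single predicted emotion is often all that is needed. The sketch below shows one way to reduce that structure to the top label. The helper name top_emotion is hypothetical, and sample_output simply copies the values printed above so the snippet runs without downloading the model.

```python
def top_emotion(prediction):
    """Return (label, score) of the highest-scoring emotion for the first input.

    `prediction` is assumed to have the nested shape shown in the output above:
    a list with one inner list of {'label', 'score'} dicts per input text.
    """
    scores = prediction[0]  # scores for the first (here: only) input text
    best = max(scores, key=lambda item: item["score"])
    return best["label"], best["score"]


# Values copied from the example output above (no model download needed).
sample_output = [[
    {"label": "sadness", "score": 0.0005138228880241513},
    {"label": "joy", "score": 0.9972520470619202},
    {"label": "love", "score": 0.0007443308713845909},
    {"label": "anger", "score": 0.0007404946954920888},
    {"label": "fear", "score": 0.00032938539516180754},
    {"label": "surprise", "score": 0.0004197491507511586},
]]

label, score = top_emotion(sample_output)
print(label, round(score, 4))
```

For the example sentence this picks joy, which carries almost all of the probability mass.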

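✨ Key Features

BERT architecture: a Transformer bidirectional encoder pretrained with the MLM (masked language modeling) objective.
Fine-tuned: trained further on an emotion dataset so that it is adapted to text emotion classification.
Multiple evaluation metrics: evaluated on both accuracy and F1 score.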
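
To make the two metrics concrete, here is a minimal pure-Python sketch of accuracy and a support-weighted F1 score. The source does not say which F1 averaging was used, so the weighted average is an assumption, and the function names and toy labels are illustrative only.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's support.

    Assumption: weighted averaging; the source only reports "F1".
    """
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * n
    return total / len(y_true)


# Toy labels for illustration only; these are not from the model's test set.
y_true = ["joy", "joy", "sadness", "anger"]
y_pred = ["joy", "sadness", "sadness", "anger"]
print(accuracy(y_true, y_pred), weighted_f1(y_true, y_pred))
```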

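📦 Installation

The source documentation gives no specific installation steps. Follow the installation instructions for the Hugging Face libraries; the examples on this page import transformers.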

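📚 Detailed Documentation

Model Description

BERT is a Transformer-based bidirectional encoder architecture trained with the MLM (masked language modeling) objective. bert-base-uncased was fine-tuned on the emotion dataset using the HuggingFace Trainer with the following training parameters:

Learning rate: 2e-5
Batch size: 64
Number of training epochs: 8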
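
As a rough sketch, the three hyperparameters above could be passed to the HuggingFace TrainingArguments class as follows. This is not the author's training script: the output directory is a hypothetical path, treating "batch size 64" as a per-device batch size is an assumption, and all other arguments stay at library defaults.

```python
# Hyperparameters stated in the model description, keyed by the matching
# TrainingArguments parameter names.
HYPERPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 64,  # assumption: the stated batch size is per device
    "num_train_epochs": 8,
}


def build_training_args(output_dir="bert-base-uncased-emotion-out"):
    """Build TrainingArguments from HYPERPARAMS.

    `output_dir` is a hypothetical path chosen for this sketch.
    """
    # Imported lazily so the module loads even where transformers is not installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

The resulting object would then be passed to Trainer along with the model, the tokenized dataset, and a metrics function.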

Model performance comparison on the Twitter emotion dataset

Model	Accuracy (%)	F1 Score (%)	Test Samples per Second
Distilbert-base-uncased-emotion	93.8	93.79	398.69
Bert-base-uncased-emotion	94.05	94.06	190.152
Roberta-base-emotion	93.95	93.97	195.639
Albert-base-v2-emotion	93.6	93.65	182.794
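
The table shows a trade-off between accuracy and speed. The short sketch below makes it explicit, using only the numbers from the table. The helper names are illustrative.

```python
# Rows copied from the comparison table:
# (model, accuracy %, F1 %, test samples per second)
RESULTS = [
    ("Distilbert-base-uncased-emotion", 93.8, 93.79, 398.69),
    ("Bert-base-uncased-emotion", 94.05, 94.06, 190.152),
    ("Roberta-base-emotion", 93.95, 93.97, 195.639),
    ("Albert-base-v2-emotion", 93.6, 93.65, 182.794),
]


def most_accurate(rows):
    """Row with the highest accuracy."""
    return max(rows, key=lambda r: r[1])


def fastest(rows):
    """Row with the highest test throughput."""
    return max(rows, key=lambda r: r[3])


def speedup(rows, fast_name, slow_name):
    """Throughput ratio between two models from the table."""
    by_name = {r[0]: r for r in rows}
    return by_name[fast_name][3] / by_name[slow_name][3]


print(most_accurate(RESULTS)[0])
print(fastest(RESULTS)[0])
print(round(speedup(RESULTS, "Distilbert-base-uncased-emotion",
                    "Bert-base-uncased-emotion"), 2))
```

By these figures the BERT variant is the most accurate of the four, while the DistilBERT variant processes roughly twice as many test samples per second and trails it by 0.25 accuracy points.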

Dataset

Twitter-Sentiment-Analysis

Training Procedure

See the Colab Notebook and change the model name from distilbert to bert.

Evaluation Results

{
    'test_accuracy': 0.9405,
    'test_f1': 0.9405920712282673,
    'test_loss': 0.15769127011299133,
    'test_runtime': 10.5179,
    'test_samples_per_second': 190.152,
    'test_steps_per_second': 3.042
}
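
The throughput fields in these results also imply the size of the test run. The sketch below derives the approximate number of test samples and evaluation steps from the reported values. Reading the result as an evaluation batch size of 64 is an inference, not something the source states.

```python
import math

# Values copied from the evaluation results above.
EVAL = {
    "test_accuracy": 0.9405,
    "test_f1": 0.9405920712282673,
    "test_loss": 0.15769127011299133,
    "test_runtime": 10.5179,
    "test_samples_per_second": 190.152,
    "test_steps_per_second": 3.042,
}

# runtime (s) * rate (per s) recovers the approximate counts.
n_samples = EVAL["test_runtime"] * EVAL["test_samples_per_second"]
n_steps = EVAL["test_runtime"] * EVAL["test_steps_per_second"]
samples_per_step = n_samples / n_steps

print(round(n_samples), round(n_steps), round(samples_per_step, 1))

# Inference (not stated in the source): a batch size of 64 with a partially
# filled final batch would give the same number of steps.
assumed_batch_size = 64
print(math.ceil(round(n_samples) / assumed_batch_size) == round(n_steps))
```

The derived counts come to about 2,000 test samples in 32 steps, which is consistent with the training batch size of 64 also being used for evaluation.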