flan-t5-tsa-thor-xl open-source model: target sentiment analysis for English text

Flan T5 Tsa Thor Xl

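Developed by nicolay-r

A target sentiment analysis model fine-tuned from Flan-T5-XL and trained with the Three-hop Reasoning (THoR) chain-of-thought framework; designed for English text

Large language model

Transformers

English | License: MIT | #target-sentiment-analysis #chain-of-thought #multi-hop-reasoning

Downloads: 34

Released: 6/2/2024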

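Model overview

The model performs target sentiment analysis: given a sentence, it predicts the sentiment polarity (positive, negative, or neutral) expressed towards a specific entity.

Model features

Three-hop chain-of-thought reasoning

Trained with the THoR framework, which reaches the sentiment label through multi-step reasoning with the aim of improving accuracy

Target sentiment analysis

Analyzes the sentiment expressed towards a specific entity in a sentence

Batch processing support

Batch-mode processing is supported since the 2025 update

Model capabilities

Text sentiment analysis

Target entity sentiment recognition

Multi-step reasoning

Use cases

Sentiment analysis

Social media sentiment monitoring

Analyzing the sentiment towards specific people or brands in social media content

F1_PN of 60.024 on the RuSentNE-2023 test set

Product review analysis

Identifying the sentiment polarity towards specific product features in reviews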

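🚀 Model Card for flan-t5-tsa-thor-xl

This model is designed for English text. It is based on the Flan-T5 architecture and fine-tuned for the Target Sentiment Analysis (TSA) task. Given an input sentence and an entity (target) mentioned in it, the model predicts the author's sentiment towards that target as one of three classes: positive, negative, or neutral.

🚀 Quick start

Direct use

Inference with the model takes three quick steps:

1. Load the model and tokenizer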

import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

# Setup model path.
model_path = "nicolay-r/flan-t5-tsa-thor-xl"
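# Setup device: use the first GPU when available, otherwise fall back to CPU.
device = "cuda:0" if torch.cuda.is_available() else "cpu"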

model = T5ForConditionalGeneration.from_pretrained(model_path, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_path)
model.to(device)

2. Define the ask helper that generates the model's response to a prompt

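def ask(prompt):
  # Tokenize the prompt and move the tensors to the model's device.
  inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(device)
  # temperature only influences the output when sampling (do_sample=True) is enabled.
  output = model.generate(**inputs, temperature=1)
  # Decode the single generated sequence back to text.
  return tokenizer.batch_decode(output, skip_special_tokens=True)[0]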

3. Set up the chain of thought

def target_sentiment_extraction(sentence, target):
  # Setup labels.
  labels_list = ['neutral', 'positive', 'negative']
  # Setup Chain-of-Thought
  step1 = f"Given the sentence {sentence}, which specific aspect of {target} is possibly mentioned?"
  aspect = ask(step1)
  step2 = f"{step1}. The mentioned aspect is about {aspect}. Based on the common sense, what is the implicit opinion towards the mentioned aspect of {target}, and why?"
  opinion = ask(step2)
  step3 = f"{step2}. The opinion towards the mentioned aspect of {target} is {opinion}. Based on such opinion, what is the sentiment polarity towards {target}?"
  emotion_state = ask(step3)
  step4 = f"{step3}. The sentiment polarity is {emotion_state}. Based on these contexts, summarize and return the sentiment polarity only, " + "such as: {}.".format(", ".join(labels_list))
  # Return the final response.
  return ask(step4)

Finally, you can obtain the model's prediction as follows:

# Input sentence.
sentence = "Over the past 28 years, the leader has been working hard to achieve the release of Peltier and is a member of the Leonard Peltier Defense Committee."
# Input target.
target = "Peltier"
# output response
flant5_response = target_sentiment_extraction(sentence, target)
print(f"Author opinion towards `{target}` in `{sentence}` is:\n{flant5_response}")

The model responds as follows:

Author opinion towards `Peltier` in `Over ...` is:
positive
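
The final prompt asks for the polarity only, but a generative model can still return extra words or different casing. The sketch below shows one possible way to map a free-text response onto the three labels. The helper name `normalize_label` and the choice to return `None` for unrecognized responses are assumptions made for illustration; they are not part of the model card or the framework.

```python
# Hypothetical post-processing helper (not part of the original model card).
LABELS = ("positive", "negative", "neutral")


def normalize_label(response, default=None):
    """Map a free-text model response onto one of the three labels.

    Returns `default` when none of the labels occurs in the response.
    """
    text = response.strip().lower()
    # Collect (position, label) for every label found in the text,
    # then keep the label that appears earliest.
    hits = [(text.find(label), label) for label in LABELS if label in text]
    return min(hits)[1] if hits else default


print(normalize_label("Positive"))
print(normalize_label("The sentiment polarity is negative."))
```

Returning `None` instead of a fallback label keeps unexpected responses visible, so they can be counted or inspected rather than silently folded into one class.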

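Downstream use

See the related section of the Reasoning-for-Sentiment-Analysis framework.

The following example evaluates the model in THoR mode on the validation data of the RuSentNE-2023 competition.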

python thor_finetune.py -m "nicolay-r/flan-t5-tsa-thor-xl" -r "thor" -d "rusentne2023" -z -bs 4 -f "./config/config.yaml"

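A Google Colab notebook is available for reproducing this setup.

Out-of-scope use

The model is a version of Flan-T5 fine-tuned on the RuSentNE-2023 dataset. Because the dataset's answers fall into three classes (positive, negative, neutral), the model's behavior may be biased towards this specific task.

Recommendations

Users (both direct and downstream) should be aware of the model's risks, biases, and limitations. More information is needed for further recommendations.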

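✨ Key features

Based on the Flan-T5 architecture and fine-tuned for the target sentiment analysis task.
Designed for English text: takes a sentence and a target entity as input and outputs the author's sentiment state.

📦 Installation

No specific installation steps are given. The quick-start code depends on the torch and transformers libraries, so install those first.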

📚 Detailed documentation

Model details

Update, 23 February 2025: 🔥 batching mode is now supported. See the 🌌 Flan-T5 provider for the bulk-chain project, which comes with a test.
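
To illustrate what batching means for a multi-step chain, here is a minimal sketch that runs the quick-start prompts for several (sentence, target) pairs with one batched call per step. It does not use the bulk-chain API. The function `thor_batch` and the `ask_batch` callable (a function that takes a list of prompts and returns a list of responses in the same order) are hypothetical names introduced only for this example.

```python
from typing import Callable, List, Sequence, Tuple

# Same label order as in the quick-start code.
LABELS = ["neutral", "positive", "negative"]


def thor_batch(pairs: Sequence[Tuple[str, str]],
               ask_batch: Callable[[List[str]], List[str]]) -> List[str]:
    """Run the quick-start prompt chain for many (sentence, target) pairs.

    `ask_batch` is a hypothetical callable: list of prompts in,
    list of responses (same order) out. Each step issues one batched call.
    """
    targets = [t for _, t in pairs]
    label_hint = ", ".join(LABELS)

    # Step 1: ask which aspect of the target is mentioned.
    prompts = [f"Given the sentence {s}, which specific aspect of {t} is possibly mentioned?"
               for s, t in pairs]
    aspects = ask_batch(prompts)

    # Step 2: extend each prompt with its aspect and ask for the implicit opinion.
    prompts = [f"{p}. The mentioned aspect is about {a}. Based on the common sense, "
               f"what is the implicit opinion towards the mentioned aspect of {t}, and why?"
               for p, a, t in zip(prompts, aspects, targets)]
    opinions = ask_batch(prompts)

    # Step 3: extend with the opinion and ask for the sentiment polarity.
    prompts = [f"{p}. The opinion towards the mentioned aspect of {t} is {o}. "
               f"Based on such opinion, what is the sentiment polarity towards {t}?"
               for p, o, t in zip(prompts, opinions, targets)]
    polarities = ask_batch(prompts)

    # Step 4: ask for the final label only.
    prompts = [f"{p}. The sentiment polarity is {e}. Based on these contexts, "
               f"summarize and return the sentiment polarity only, such as: {label_hint}."
               for p, e in zip(prompts, polarities)]
    return ask_batch(prompts)
```

With the model and tokenizer loaded as in the quick start, `ask_batch` could plausibly tokenize the whole prompt list with `padding=True`, call `model.generate` once, and decode every returned row with `tokenizer.batch_decode`. That wiring is an assumption and is left out so the sketch runs without downloading the model.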

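The model is a chain-of-thought-tuned version of Flan-T5 for the Target Sentiment Analysis (TSA) task, trained on the training data of the RuSentNE-2023 collection.

The model is designed for English text. Because the original collection is not in English, its content was automatically translated into English using [googletrans].

Given an input sentence and an entity (target) mentioned in it, the model predicts the author's state by answering with one of the following classes: [positive, negative, neutral]

Model description

Developed by: improved by nicolay-r; the initial implementation is credited to scofield7419
Model type: Flan-T5
Language (NLP): English
License: Apache License 2.0

Model sources

Repository: Reasoning-for-Sentiment-Analysis-Framework
Paper: https://arxiv.org/abs/2404.12342
Demo: code for running the related models on Google Colab is available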

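🔧 Technical details

Training data

We used the train data, automatically translated into English with the GoogleTrans API. The original texts are in Russian and come from the following repository: https://github.com/dialogue-evaluation/RuSentNE-evaluation

The translated English version of the dataset can be downloaded automatically with the following script: https://github.com/nicolay-r/Reasoning-for-Sentiment-Analysis-Framework/blob/main/rusentne23_download.py

Training procedure

The model was trained with the Three-hop Reasoning framework proposed in the paper: https://arxiv.org/abs/2305.11255

Training used an improved version of that framework: https://github.com/nicolay-r/Reasoning-for-Sentiment-Analysis-Framework

Google Colab notebook for reproduction: https://colab.research.google.com/github/nicolay-r/Reasoning-for-Sentiment-Analysis-Framework/blob/main/Reasoning_for_Sentiment_Analysis_Framework.ipynb

Setup: Flan-T5-xl, a maximum input length of 64 tokens, batch size of 4.

GPU: NVIDIA A100, bfloat16, about 30 minutes per epoch

Training ran for 3 epochs in total.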

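Training hyperparameters

Training regime: all configuration details are given in the related config file.

🔍 Evaluation

Testing data, factors, and metrics

Testing data

Direct link to the test evaluation data: https://github.com/dialogue-evaluation/RuSentNE-evaluation/blob/main/final_data.csv

Metrics

Two metrics are used for model evaluation:

F1_PN: F1 measure over the positive and negative classes;
F1_PN0: F1 measure over the positive, negative, and neutral classes.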
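
The two metrics can be sketched in plain Python as follows. This sketch assumes that each metric is the macro-average of per-class F1 scores over the listed classes, reported on a 0 to 100 scale. The averaging mode and the function names are assumptions for illustration, and the official evaluation script may differ in details.

```python
def f1_per_class(y_true, y_pred, label):
    """F1 score of a single class, computed from label lists of equal length."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    denom = 2 * tp + fp + fn
    # A class that never occurs and is never predicted scores 0 here.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 over `labels`, scaled to 0..100."""
    return 100.0 * sum(f1_per_class(y_true, y_pred, l) for l in labels) / len(labels)


def f1_pn(y_true, y_pred):
    # Assumed definition: macro F1 over the positive and negative classes only.
    return macro_f1(y_true, y_pred, ["positive", "negative"])


def f1_pn0(y_true, y_pred):
    # Assumed definition: macro F1 over all three classes.
    return macro_f1(y_true, y_pred, ["positive", "negative", "neutral"])


gold = ["positive", "negative", "neutral", "positive"]
pred = ["positive", "neutral", "neutral", "negative"]
print(round(f1_pn(gold, pred), 3), round(f1_pn0(gold, pred), 3))
```

In this toy example the neutral class is predicted well while the negative class is missed entirely, so F1_PN0 comes out higher than F1_PN, the same direction as in the training log.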

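Results

The test evaluation for this model reports F1_PN = 60.024

The training log below shows the validation results after each of the 3 training epochs (rows 0 to 2), followed by the final performance on the RuSentNE-2023 test set (rows 3 and 4):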

    F1_PN  F1_PN0  default   mode
0  66.678  73.838   73.838  valid
1  68.019  74.816   74.816  valid
2  67.870  74.688   74.688  valid
3  65.090  72.449   72.449   test
4  65.090  72.449   72.449   test

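📄 License

Apache License 2.0

📋 Model information table

Attribute	Details
Model type	Flan-T5
Training data	Training data of the RuSentNE-2023 collection; originally in Russian, automatically translated into English.
License	Apache License 2.0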