t5-base-qa-summary-emotion open-source model: question answering, summarization, and emotion detection

T5 Base Qa Summary Emotion

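Developed by kiri-ai

A multi-task model based on the T5 architecture that combines question answering, text summarization, and emotion detection, fine-tuned on several datasets.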

Large language model

Transformers

Language: English | License: Apache-2.0 | Tags: #multi-task-qa #context-aware #emotion-detection

Downloads: 45

Released: 3/2/2022

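Model overview

The model is based on the T5 architecture and fine-tuned on the CoQA, SQuAD 2, GoEmotions, and CNN/DailyMail datasets. It can perform question answering, text summarization, and emotion detection.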

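Model features

Multi-task integration

A single model handles three tasks: question answering, summarization, and emotion detection.

Conversational question answering

Accepts earlier question-answer turns as input, so it can resolve follow-up questions that depend on them.

Fine-tuned on multiple datasets

Fine-tuned on CoQA, SQuAD 2, GoEmotions, and CNN/DailyMail.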

Model capabilities

Question answering

Text summarization

Emotion detection

Conversational understanding

Context-dependent analysis

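Use cases

Customer-service assistants

Multi-turn dialogue

Handles consecutive user questions and tracks how each one relates to the earlier turns.

Reported F1: 79.5 on the SQuAD 2 dev set and 70.6 on the CoQA dev set.

Content analysis

News summarization

Generates short summaries of news articles.

Emotion analysis of user reviews

Identifies the emotion expressed in a text.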

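🚀 T5 Base model: question answering + summarization + emotion detection

This model combines question answering, text summarization, and emotion detection in a single T5 checkpoint fine-tuned on CoQA, SQuAD 2, GoEmotions, and CNN/DailyMail.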

🚀 Quick start

Dependencies

Requires transformers>=4.0.0

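✨ Key features

Multi-task: one model covers question answering, text summarization, and emotion detection.
Fine-tuning: trained on the CoQA, SQuAD 2, GoEmotions, and CNN/DailyMail datasets.
Results: F1 of 79.5 on the SQuAD 2 dev set and 70.6 on the CoQA dev set.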

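📦 Installation

Install the transformers library, version 4.0.0 or later. Quote the requirement so the shell does not treat > as an output redirect. The examples that use return_tensors='pt' and T5Tokenizer also need PyTorch and the sentencepiece package.

pip install "transformers>=4.0.0"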

💻 Usage examples

Basic usage

Question answering

Using the Transformers library

from transformers import T5ForConditionalGeneration, T5Tokenizer
model = T5ForConditionalGeneration.from_pretrained("kiri-ai/t5-base-qa-summary-emotion")
tokenizer = T5Tokenizer.from_pretrained("kiri-ai/t5-base-qa-summary-emotion")

def get_answer(question, prev_qa, context):
    input_text = [f"q: {qa[0]} a: {qa[1]}" for qa in prev_qa]
    input_text.append(f"q: {question}")
    input_text.append(f"c: {context}")
    input_text = " ".join(input_text)
    features = tokenizer([input_text], return_tensors='pt')
    tokens = model.generate(input_ids=features['input_ids'], 
            attention_mask=features['attention_mask'], max_length=64)
    return tokenizer.decode(tokens[0], skip_special_tokens=True)

# No earlier turns: pass an empty prev_qa list, then the context
print(get_answer("Why is the moon yellow?", [], "I'm not entirely sure why the moon is yellow.")) # unknown

context = "Elon Musk left OpenAI to avoid possible future conflicts with his role as CEO of Tesla."

print(get_answer("Why not?", [("Does Elon Musk still work with OpenAI", "No")], context)) # to avoid possible future conflicts with his role as CEO of Tesla
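
To make the input format explicit, the sketch below isolates the string assembly from get_answer so it can be inspected without loading the model. The helper name build_qa_prompt is hypothetical and not part of Transformers or Kiri. The layout simply mirrors the code above: earlier turns as "q: ... a: ..." pairs, then the new question, then the context.

```python
from typing import List, Sequence, Tuple


def build_qa_prompt(question: str,
                    prev_qa: Sequence[Tuple[str, str]],
                    context: str) -> str:
    """Assemble the text fed to the model for (conversational) QA.

    Hypothetical helper: it mirrors the string handling in get_answer
    and does not call the model or the tokenizer.
    """
    # One "q: ... a: ..." segment per earlier question-answer turn
    parts: List[str] = [f"q: {q} a: {a}" for q, a in prev_qa]
    # The current question comes after the history
    parts.append(f"q: {question}")
    # The context passage always comes last
    parts.append(f"c: {context}")
    return " ".join(parts)


context = "Elon Musk left OpenAI to avoid possible future conflicts with his role as CEO of Tesla."
prompt = build_qa_prompt("Why not?", [("Does Elon Musk still work with OpenAI", "No")], context)
print(prompt)
```

With an empty prev_qa the prompt reduces to "q: <question> c: <context>", which is the non-conversational case.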

Using the Kiri library

from kiri.models import T5QASummaryEmotion

context = "Elon Musk left OpenAI to avoid possible future conflicts with his role as CEO of Tesla."
prev_qa = [("Does Elon Musk still work with OpenAI", "No")]
model = T5QASummaryEmotion()

# Leave prev_qa blank for non conversational question-answering
model.qa("Why not?", context, prev_qa=prev_qa)
> "to avoid possible future conflicts with his role as CEO of Tesla"

Text summarization

Using the Transformers library

from transformers import T5ForConditionalGeneration, T5Tokenizer
model = T5ForConditionalGeneration.from_pretrained("kiri-ai/t5-base-qa-summary-emotion")
tokenizer = T5Tokenizer.from_pretrained("kiri-ai/t5-base-qa-summary-emotion")

def summary(context):
    input_text = f"summarize: {context}"
    features = tokenizer([input_text], return_tensors='pt')
    tokens = model.generate(input_ids=features['input_ids'], 
            attention_mask=features['attention_mask'], max_length=64)
    return tokenizer.decode(tokens[0], skip_special_tokens=True)

Using the Kiri library

from kiri.models import T5QASummaryEmotion

model = T5QASummaryEmotion()

model.summarise("Long text to summarise")
> "Short summary of long text"

Emotion detection

Using the Transformers library

from transformers import T5ForConditionalGeneration, T5Tokenizer
model = T5ForConditionalGeneration.from_pretrained("kiri-ai/t5-base-qa-summary-emotion")
tokenizer = T5Tokenizer.from_pretrained("kiri-ai/t5-base-qa-summary-emotion")

def emotion(context):
    input_text = f"emotion: {context}"
    features = tokenizer([input_text], return_tensors='pt')
    tokens = model.generate(input_ids=features['input_ids'], 
            attention_mask=features['attention_mask'], max_length=64)
    return tokenizer.decode(tokens[0], skip_special_tokens=True)
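
The summary and emotion functions differ only in the prefix placed before the input text. As a minimal sketch of that pattern, the helper below maps a task name to its prefix. The names TASK_PREFIXES and build_task_prompt are hypothetical, and the two prefixes are copied from the functions shown in this section. The sketch only builds the string and does not call the model.

```python
# Prefixes taken from the summary() and emotion() examples in this section
TASK_PREFIXES = {
    "summary": "summarize: ",
    "emotion": "emotion: ",
}


def build_task_prompt(task: str, text: str) -> str:
    """Return the prefixed input string for a single-text task.

    Hypothetical helper; raises ValueError for an unknown task name.
    """
    if task not in TASK_PREFIXES:
        known = ", ".join(sorted(TASK_PREFIXES))
        raise ValueError(f"unknown task {task!r}; expected one of: {known}")
    return TASK_PREFIXES[task] + text


print(build_task_prompt("emotion", "I hope this works!"))
print(build_task_prompt("summary", "Long text to summarise"))
```

The returned string can be passed to the tokenizer in place of input_text in either function.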

Using the Kiri library

from kiri.models import T5QASummaryEmotion

model = T5QASummaryEmotion()

model.emotion("I hope this works!")
> "optimism"

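📚 Documentation

Description

The model was fine-tuned on the CoQA, SQuAD 2, GoEmotions, and CNN/DailyMail datasets.

It reaches an F1 of 79.5 on the SQuAD 2 dev set and 70.6 on the CoQA dev set.

Summarization and emotion detection have not been evaluated yet.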
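
The page does not say how the F1 figures were computed. Assuming they follow the usual SQuAD-style token-overlap F1 (an assumption, not something stated here), a per-answer score could be sketched as below. The function names are illustrative. The official scripts additionally take the best score over several reference answers and average over the dataset, reporting the result as a percentage.

```python
import re
import string
from collections import Counter
from typing import List


def normalize_answer(text: str) -> List[str]:
    """Lowercase, drop punctuation and English articles, split on whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between one prediction and one reference answer."""
    pred = normalize_answer(prediction)
    ref = normalize_answer(reference)
    if not pred or not ref:
        # Both empty counts as a match; one empty counts as a miss
        return float(pred == ref)
    # Multiset intersection counts each shared token at most min(count) times
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


print(token_f1("to avoid conflicts", "to avoid possible future conflicts"))
```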

📄 License

This project is licensed under Apache-2.0.

About us

Kiri makes using state-of-the-art models easy, accessible, and scalable.

Website | Natural Language Engine

📦 Model details

Attribute	Details
Model type	Text-to-text generation
Training data	CoQA, SQuAD 2, GoEmotions, CNN/DailyMail
Evaluation metric	F1
