ke-t5-base Open-Source Text-to-Text Model - Free to Deploy, Supports Multiple NLP Tasks

Ke T5 Base

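Developed by KETI-AIR

KE-T5 is a text-to-text model based on the T5 architecture. It was developed by the Korea Electronics Technology Institute (KETI) and supports a range of NLP tasks.

Large language model | Multilingual | License: Apache-2.0 | #text-generation #unified-multitask-framework #korean-optimized

Downloads: 3,197

Released: 3/2/2022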

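Model Overview

KE-T5 Base is a text generation model built on the T5 architecture. It uses a unified text-to-text framework to handle NLP tasks such as machine translation, document summarization, question answering, and classification.

Model Features

Unified text-to-text framework

Every NLP task is cast into the same text-to-text format: both the input and the output are text strings.

Multi-task support

Applicable to machine translation, document summarization, question answering, classification, and other NLP tasks.

Transfer learning

Built on the T5 architecture, so the pretrained checkpoint can be fine-tuned for downstream tasks.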

Model Capabilities

Text generation

Machine translation

Document summarization

Question answering

Text classification

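Use Cases

Natural language processing

Machine translation

Convert text in one language into another language.

Document summarization

Produce a short summary of a long document.

Sentiment analysis

Determine the sentiment expressed in a piece of text.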

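🚀 ke-t5-base Model Card

ke-t5-base is a text generation model based on the T5 architecture. It can be used for NLP tasks such as machine translation, document summarization, question answering, and classification. The model is shared by the Korea Electronics Technology Institute Artificial Intelligence Research Center.

🚀 Quick Start

Use the following code to get started with the model: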

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("KETI-AIR/ke-t5-base")

model = AutoModelForSeq2SeqLM.from_pretrained("KETI-AIR/ke-t5-base")

For more examples, see the Hugging Face T5 documentation and the Colab notebook created by the model developers.
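Loading the checkpoint is only the first step. The sketch below shows one possible way to run generation with it. The helper names `build_infill_prompt` and `generate_text` are hypothetical, and the prompt assumes the tokenizer defines the T5-style sentinel token `<extra_id_0>`, which this card does not confirm. Since the checkpoint is pretrained rather than fine-tuned, its raw output may not be useful for a downstream task without further training.

```python
def build_infill_prompt(before: str, after: str, sentinel: str = "<extra_id_0>") -> str:
    """Join the text around a gap with a sentinel token marking the gap.

    Assumption: the tokenizer knows `sentinel` as a special token.
    """
    parts = (before.strip(), sentinel, after.strip())
    return " ".join(part for part in parts if part)


def generate_text(prompt: str,
                  model_name: str = "KETI-AIR/ke-t5-base",
                  max_new_tokens: int = 20) -> str:
    """Download the checkpoint (network access required) and generate from `prompt`."""
    # Imported lazily so the prompt helper works without transformers installed.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep special tokens so any sentinel tokens in the output stay visible.
    return tokenizer.decode(output_ids[0], skip_special_tokens=False)


# Example call (not executed here because it downloads the model):
#   prompt = build_infill_prompt("The weather today is", ".")
#   print(generate_text(prompt))
```

`generate_text` reloads the model on every call to keep the sketch short; real code would load the tokenizer and model once and reuse them.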

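✨ Key Features

Unified text-to-text format: T5 recasts every NLP task into a single text-to-text format in which the input and output are always text strings, so the same model, loss function, and hyperparameters can be used across NLP tasks.
Multilingual support: English and Korean.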
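To make the unified format concrete, here is a minimal sketch of how different tasks can be cast into (source text, target text) pairs. The prefixes follow the style used in the T5 paper ("translate English to German:", "summarize:", "sst2 sentence:"). Whether ke-t5-base was trained with these exact prefixes is not stated in this card, so treat them, the `TextToTextExample` class, and the helper names as illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextToTextExample:
    source: str  # text fed to the encoder
    target: str  # text the decoder is trained to produce


def as_translation(text: str, translation: str,
                   src_lang: str = "English", tgt_lang: str = "Korean") -> TextToTextExample:
    # Assumed prefix, modeled on the T5 paper's translation prefix.
    return TextToTextExample(f"translate {src_lang} to {tgt_lang}: {text}", translation)


def as_summarization(document: str, summary: str) -> TextToTextExample:
    return TextToTextExample(f"summarize: {document}", summary)


def as_classification(task_name: str, sentence: str, label: str) -> TextToTextExample:
    # The class label is emitted as a plain word, not as a class index.
    return TextToTextExample(f"{task_name} sentence: {sentence}", label)


examples = [
    as_translation("Hello", "안녕하세요"),
    as_summarization("A long article about T5 ...", "A short summary."),
    as_classification("sst2", "This movie was great.", "positive"),
]
for example in examples:
    print(repr(example.source), "->", repr(example.target))
```

Because every task reduces to the same pair of strings, one training loop and one loss function can serve all of them.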

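📚 Documentation

Model Details

Model Description

The developers of the Text-To-Text Transfer Transformer (T5) write:

With T5, we propose reframing all NLP tasks into a unified text-to-text-format where the input and output are always text strings, in contrast to BERT-style models that can only output either a class label or a span of the input. Our text-to-text framework allows us to use the same model, loss function, and hyperparameters on any NLP task.

T5-Base is the checkpoint with 220 million parameters.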
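The 220 million figure can be roughly reproduced with some arithmetic. The sketch below assumes the hyperparameters of the original T5-Base configuration (vocabulary 32,128, model width 768, feed-forward width 3,072, 12 layers, 12 heads of size 64, 32 relative-position buckets), bias-free linear layers, scale-only layer norms, and a single embedding matrix shared with the output layer. None of these values are given in this card, and ke-t5-base may differ, in particular in vocabulary size.

```python
def count_t5_parameters(vocab_size: int = 32128, d_model: int = 768, d_ff: int = 3072,
                        d_kv: int = 64, num_heads: int = 12, num_layers: int = 12,
                        num_buckets: int = 32) -> int:
    """Approximate parameter count of a T5-style encoder-decoder (assumed layout)."""
    inner = num_heads * d_kv
    embedding = vocab_size * d_model       # shared by encoder, decoder, and output layer
    attention = 4 * d_model * inner        # Q, K, V, and output projections, no biases
    feed_forward = 2 * d_model * d_ff      # two linear layers, no biases
    layer_norm = d_model                   # scale vector only

    # Encoder layer: self-attention + feed-forward, each with a layer norm.
    encoder = num_layers * (attention + feed_forward + 2 * layer_norm) + layer_norm
    # Decoder layer: self-attention + cross-attention + feed-forward.
    decoder = num_layers * (2 * attention + feed_forward + 3 * layer_norm) + layer_norm
    # One relative-position bias table each for the encoder and the decoder.
    relative_bias = 2 * num_buckets * num_heads

    return embedding + encoder + decoder + relative_bias


total = count_t5_parameters()
print(f"{total:,} parameters (about {total / 1e6:.0f} million)")
```

With the default arguments the total lands close to the quoted 220 million. Passing a larger `vocab_size` shows how much of the budget the embedding table takes.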

Developed by: Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu
Shared by: Korea Electronics Technology Institute Artificial Intelligence Research Center
Model type: Text generation
Language(s): English, Korean
License: Apache-2.0
Related models:
- Parent model: T5
Resources for more information:

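Uses

Direct Use

The developers write in a blog post:

Our text-to-text framework allows us to use the same model, loss function, and hyperparameters on any NLP task, including machine translation, document summarization, question answering, and classification tasks (e.g., sentiment analysis). We can even apply T5 to regression tasks by training it to predict the string representation of a number instead of the number itself.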
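The regression trick at the end of the quote can be sketched as a pair of conversions between numbers and strings. The rounding step of 0.2 and the 0 to 5 score range follow the way the T5 paper describes handling the STS-B similarity task; the function names are hypothetical, and the one-decimal formatting assumes a step such as 0.2 that yields a single decimal place.

```python
from typing import Optional


def score_to_target(score: float, step: float = 0.2) -> str:
    """Round a numeric label to the nearest `step` and render it as target text."""
    rounded = round(score / step) * step
    return f"{rounded:.1f}"


def target_to_score(text: str, low: float = 0.0, high: float = 5.0) -> Optional[float]:
    """Parse generated text back into a number; return None if it is not a valid score."""
    try:
        value = float(text.strip())
    except ValueError:
        return None
    return value if low <= value <= high else None


print(score_to_target(2.57))        # target string the model would be trained to emit
print(target_to_score("2.6"))       # numeric value recovered from model output
print(target_to_score("no idea"))   # unparseable output is rejected
```

Rounding to a fixed step turns regression into predicting one of a small set of strings, and the parser has to tolerate generated text that is not a number at all.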

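Out-of-Scope Use

The model should not be used to intentionally create hostile or alienating environments for people.

Bias, Risks, and Limitations

Significant research has explored bias and fairness issues with language models (see, e.g., Sheng et al. (2021) and Bender et al. (2021)). Predictions generated by the model may include disturbing and harmful stereotypes across protected classes, identity characteristics, and sensitive, social, and occupational groups.

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.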

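Training Details

Training Data

The model is pre-trained on the Colossal Clean Crawled Corpus (C4), which was developed and released in the context of the same research paper as T5.

The model was pre-trained on a multi-task mixture of unsupervised (1.) and supervised (2.) tasks.

See the t5-base model card for more information.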
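This card does not spell out the unsupervised objective. In the T5 paper it is a span-corruption denoising task: spans of the input are replaced with sentinel tokens, and the target lists each sentinel followed by the text it hid. The sketch below builds one such training pair. It is a simplified illustration, not the actual preprocessing code: it splits on whitespace instead of using the model's tokenizer, the spans are chosen by hand rather than sampled randomly, and the `<extra_id_N>` sentinel naming is assumed from the Hugging Face T5 convention.

```python
from typing import List, Sequence, Tuple


def corrupt_spans(tokens: Sequence[str],
                  spans: Sequence[Tuple[int, int]]) -> Tuple[str, str]:
    """Build an (input, target) pair by masking the given token spans.

    `spans` must be sorted, non-overlapping (start, end) pairs with `end` exclusive.
    """
    inputs: List[str] = []
    targets: List[str] = []
    cursor = 0
    for index, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{index}>"
        inputs.extend(tokens[cursor:start])   # keep the text before the span
        inputs.append(sentinel)               # replace the span with one sentinel
        targets.append(sentinel)              # target: sentinel, then the hidden text
        targets.extend(tokens[start:end])
        cursor = end
    inputs.extend(tokens[cursor:])            # keep the text after the last span
    targets.append(f"<extra_id_{len(spans)}>")  # closing sentinel ends the target
    return " ".join(inputs), " ".join(targets)


tokens = "Thank you for inviting me to your party last week .".split()
source, target = corrupt_spans(tokens, [(2, 4), (8, 9)])
print(source)
print(target)
```

The model sees `source` and is trained to generate `target`, so the denoising task fits the same text-to-text interface as the supervised tasks.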

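Training Procedure

Preprocessing: More information needed.
Speeds, sizes, times: More information needed.

Evaluation

Testing Data, Factors, and Metrics

Testing data: The developers evaluated the model on 24 tasks; see the research paper for full details.
Factors: More information needed.
Metrics: More information needed.

Results

For the full results for T5-Base, see Table 14 of the research paper.

Model Examination

More information needed.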

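Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

Hardware type: Google Cloud TPU Pods
Hours used: More information needed.
Cloud provider: GCP
Compute region: More information needed.
Carbon emitted: More information needed.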
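Estimates of this kind generally multiply the energy consumed by the carbon intensity of the local grid. The sketch below shows that arithmetic only. It is not the calculator's own code, the optional data-center overhead factor (`pue`) is an added assumption, and every input value is a hypothetical placeholder, since this card reports neither the hours used nor the compute region.

```python
def estimate_co2_kg(hardware_power_kw: float, hours: float,
                    carbon_intensity_kg_per_kwh: float, pue: float = 1.0) -> float:
    """Estimate emissions as energy (kWh) times grid carbon intensity (kg CO2eq per kWh)."""
    if min(hardware_power_kw, hours, carbon_intensity_kg_per_kwh) < 0 or pue < 1.0:
        raise ValueError("inputs must be non-negative and pue must be at least 1.0")
    energy_kwh = hardware_power_kw * hours * pue
    return energy_kwh * carbon_intensity_kg_per_kwh


# Hypothetical placeholders, NOT measurements for this model.
example_kg = estimate_co2_kg(hardware_power_kw=1.0, hours=10.0,
                             carbon_intensity_kg_per_kwh=0.5)
print(f"{example_kg:.1f} kg CO2eq for the placeholder inputs")
```

Filling in real values would require the missing training duration, the power draw of the hardware, and the carbon intensity of the region where it ran.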

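Technical Specifications (Optional)

Model architecture and objective: More information needed.
Compute infrastructure: More information needed.
- Hardware: More information needed.
- Software: More information needed.

Citation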

BibTeX

@inproceedings{kim-etal-2021-model-cross,
    title = "A Model of Cross-Lingual Knowledge-Grounded Response Generation for Open-Domain Dialogue Systems",
    author = "Kim, San  and
      Jang, Jin Yea  and
      Jung, Minyoung  and
      Shin, Saim",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
    month = nov,
    year = "2021",
    address = "Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.findings-emnlp.33",
    doi = "10.18653/v1/2021.findings-emnlp.33",
    pages = "352--365",
    abstract = "Research on open-domain dialogue systems that allow free topics is challenging in the field of natural language processing (NLP). The performance of the dialogue system has been improved recently by the method utilizing dialogue-related knowledge; however, non-English dialogue systems suffer from reproducing the performance of English dialogue systems because securing knowledge in the same language with the dialogue system is relatively difficult. Through experiments with a Korean dialogue system, this paper proves that the performance of a non-English dialogue system can be improved by utilizing English knowledge, highlighting the system uses cross-lingual knowledge. For the experiments, we 1) constructed a Korean version of the Wizard of Wikipedia dataset, 2) built Korean-English T5 (KE-T5), a language model pre-trained with Korean and English corpus, and 3) developed a knowledge-grounded Korean dialogue model based on KE-T5. We observed the performance improvement in the open-domain Korean dialogue model even only English knowledge was given. The experimental results showed that the knowledge inherent in cross-lingual language models can be helpful for generating responses in open dialogue systems.",
}

@article{2020t5,
  author  = {Colin Raffel and Noam Shazeer and Adam Roberts and Katherine Lee and Sharan Narang and Michael Matena and Yanqi Zhou and Wei Li and Peter J. Liu},
  title   = {Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer},
  journal = {Journal of Machine Learning Research},
  year    = {2020},
  volume  = {21},
  number  = {140},
  pages   = {1-67},
  url     = {http://jmlr.org/papers/v21/20-074.html}
}

APA

Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., ... & Liu, P. J. (2020). Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res., 21(140), 1-67.