ke-t5-base: an open-source text-to-text model, free to deploy, supporting a range of NLP tasks

Ke T5 Base

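Developed by KETI-AIR

KE-T5 is a text-to-text model based on the T5 architecture. It was developed by the Korea Electronics Technology Institute and supports a range of NLP tasks.

Large language model | Multilingual | License: Apache-2.0 | #text-generation #unified-multitask-framework #korean-optimized

Downloads: 3,197

Released: 3/2/2022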

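Model Overview

KE-T5 Base is a text generation model built on the T5 architecture. It uses a unified text-to-text framework to handle NLP tasks such as machine translation, document summarization, question answering, and classification.

Model Features

Unified text-to-text framework

Every NLP task is cast in the same text-to-text format: both the input and the output are text strings.

Multi-task support

Applicable to machine translation, document summarization, question answering, classification, and other NLP tasks.

Transfer learning

Built on the T5 architecture, which was designed around transfer learning: pre-train once, then fine-tune for each downstream task.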

Model Capabilities

Text generation

Machine translation

Document summarization

Question answering

Text classification

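Use Cases

Natural language processing

Machine translation

Convert text from one language into another.

Document summarization

Produce a short summary of a long document.

Sentiment analysis

Identify the sentiment expressed in a text.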

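🚀 ke-t5-base Model Card

ke-t5-base is a text generation model based on the T5 architecture. It can be used for a range of natural language processing tasks, including machine translation, document summarization, question answering, and classification. The model is shared by the Korea Electronics Technology Institute Artificial Intelligence Research Center.

🚀 Quick Start

Use the code below to get started with the model: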

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the tokenizer and the sequence-to-sequence model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("KETI-AIR/ke-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("KETI-AIR/ke-t5-base")

For more examples, see the Hugging Face T5 documentation and the Colab Notebook created by the model developers.
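The loading snippet above stops before any text is produced. As a minimal sketch, the helper below shows one way the loaded objects could be used for generation. The function names `generate_text` and `load_and_generate` and the default of 32 new tokens are illustrative assumptions, not part of the model card; the `tokenizer(...)`, `model.generate(...)`, and `tokenizer.decode(...)` calls are the standard Transformers ones. Since the card describes this checkpoint only as pre-trained, outputs may not be meaningful for a specific task without fine-tuning.

```python
def generate_text(model, tokenizer, text, max_new_tokens=32):
    """Encode `text`, run generation, and decode the first returned sequence."""
    # return_tensors="pt" asks the tokenizer for PyTorch tensors.
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop padding and end-of-sequence markers from the decoded string.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


def load_and_generate(text, model_name="KETI-AIR/ke-t5-base"):
    """Download the checkpoint (needs network access) and generate from `text`."""
    # Imported lazily so generate_text can be reused without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    return generate_text(model, tokenizer, text)
```

Calling `load_and_generate("some input text")` downloads the weights on first use and requires PyTorch; `generate_text` can be reused with an already loaded or fine-tuned model.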

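✨ Key Features

Unified text-to-text format: T5 reframes every natural language processing task in a single text-to-text format in which the input and output are always text strings, so the same model, loss function, and hyperparameters can be used across NLP tasks.
Multilingual: supports English and Korean.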
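To make the unified format concrete, the sketch below casts three different tasks into the same (input string, target string) shape. The helper name `to_text_to_text`, the task prefixes, and the sample sentences are hypothetical choices for illustration; the card does not state which prefixes, if any, this checkpoint expects.

```python
def to_text_to_text(task_prefix, source, target):
    """Represent any task as a pair of plain strings: prefixed input and target."""
    return {"input": f"{task_prefix}: {source}", "target": str(target)}


# Hypothetical examples: translation, summarization, and classification
# all end up in the same format, so one model and one loss can train on them.
examples = [
    to_text_to_text("translate English to Korean", "Thank you.", "감사합니다."),
    to_text_to_text("summarize", "<long document text>", "<short summary>"),
    to_text_to_text("sentiment", "I loved this film.", "positive"),
]

for example in examples:
    print(example["input"], "->", example["target"])
```

Even the classification label is emitted as text ("positive") rather than as a class index, which is what lets a single sequence-to-sequence model cover all three tasks.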

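📚 Documentation

Model Details

Model Description

The developers of the Text-To-Text Transfer Transformer (T5) write:

With T5, we propose reframing all NLP tasks into a unified text-to-text-format where the input and output are always text strings, in contrast to BERT-style models that can only output either a class label or a span of the input. Our text-to-text framework allows us to use the same model, loss function, and hyperparameters on any NLP task.

T5-Base is the checkpoint with 220 million parameters.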

Developed by: Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu.
Shared by: Korea Electronics Technology Institute Artificial Intelligence Research Center
Model type: Text generation
Languages: English, Korean
License: Apache-2.0
Related models:
- Parent model: T5
Resources for more information:

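Uses

Direct Use

In a blog post, the developers write:

Our text-to-text framework allows us to use the same model, loss function, and hyperparameters on any NLP task, including machine translation, document summarization, question answering, and classification tasks (e.g., sentiment analysis). We can even apply T5 to regression tasks by training it to predict the string representation of a number instead of the number itself.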
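The regression remark in the quote can be sketched as a pair of conversions between numeric scores and target strings. The rounding step of 0.2 and the valid range of 1 to 5 follow the STS-B setup as described in the T5 paper, and are assumptions here since the card itself gives no details; the function names are hypothetical.

```python
def score_to_target(score, step=0.2):
    """Round a numeric label to the nearest multiple of `step` and render it as text."""
    rounded = round(score / step) * step
    return f"{rounded:.1f}"


def target_to_score(text, low=1.0, high=5.0):
    """Parse generated text back into a number; return None if it is not a valid score."""
    try:
        value = float(text)
    except ValueError:
        return None
    # Anything outside the expected range is treated as an invalid prediction.
    return value if low <= value <= high else None


print(score_to_target(2.57))       # training target string for a label of 2.57
print(target_to_score("2.6"))      # numeric value recovered from model output
print(target_to_score("positive")) # non-numeric output is rejected
```

Training then proceeds exactly as for any other text-to-text task: the model learns to emit the string, and the number is recovered only at evaluation time.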

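Out-of-Scope Use

The model should not be used to intentionally create hostile or alienating environments for people.

Bias, Risks, and Limitations

Significant research has explored bias and fairness issues with language models (see, e.g., Sheng et al. (2021) and Bender et al. (2021)). Predictions generated by the model may include disturbing and harmful stereotypes across protected classes, identity characteristics, and sensitive, social, and occupational groups.

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.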

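Training Details

Training Data

The model is pre-trained on the Colossal Clean Crawled Corpus (C4), which was developed and released in the context of the same research paper as T5.

The model was pre-trained on a multi-task mixture of unsupervised (1.) and supervised tasks (2.).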
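The card does not say how the tasks in the mixture were weighted. One strategy described in the T5 paper is examples-proportional mixing with an artificial dataset size limit K, where task m is sampled with rate r_m = min(e_m, K) / Σ_n min(e_n, K) and e_m is the number of examples in task m. The sketch below computes these rates; the task names, dataset sizes, and limit are hypothetical placeholders, and whether this checkpoint used this strategy is an assumption.

```python
def mixing_rates(dataset_sizes, size_limit):
    """Examples-proportional mixing: cap each dataset size, then normalize to rates."""
    capped = {name: min(size, size_limit) for name, size in dataset_sizes.items()}
    total = sum(capped.values())
    return {name: count / total for name, count in capped.items()}


# Hypothetical sizes, chosen only to show the effect of the cap.
sizes = {
    "unsupervised_denoising": 1_000_000,
    "translation": 300_000,
    "classification": 100_000,
}
rates = mixing_rates(sizes, size_limit=400_000)
print(rates)
```

Without the cap, the large unsupervised task would account for about 71% of sampled examples; with the cap at 400,000 it accounts for 50%, which keeps smaller supervised tasks from being crowded out.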

See the t5-base model card for further information.

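Training Procedure

Preprocessing: More information needed.
Speeds, sizes, times: More information needed.

Evaluation

Testing Data, Factors, and Metrics

Testing data: The developers evaluated the model on 24 tasks; see the research paper for full details.
Factors: More information needed.
Metrics: More information needed.

Results

For full results for T5-Base, see Table 14 of the research paper.

Model Examination

More information needed.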

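Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

Hardware type: Google Cloud TPU Pods
Hours used: More information needed.
Cloud provider: GCP
Compute region: More information needed.
Carbon emitted: More information needed.

Technical Specifications (optional)

Model architecture and objective: More information needed.
Compute infrastructure: More information needed.
- Hardware: More information needed.
- Software: More information needed.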

Citation

BibTeX

@inproceedings{kim-etal-2021-model-cross,
    title = "A Model of Cross-Lingual Knowledge-Grounded Response Generation for Open-Domain Dialogue Systems",
    author = "Kim, San  and
      Jang, Jin Yea  and
      Jung, Minyoung  and
      Shin, Saim",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
    month = nov,
    year = "2021",
    address = "Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.findings-emnlp.33",
    doi = "10.18653/v1/2021.findings-emnlp.33",
    pages = "352--365",
    abstract = "Research on open-domain dialogue systems that allow free topics is challenging in the field of natural language processing (NLP). The performance of the dialogue system has been improved recently by the method utilizing dialogue-related knowledge; however, non-English dialogue systems suffer from reproducing the performance of English dialogue systems because securing knowledge in the same language with the dialogue system is relatively difficult. Through experiments with a Korean dialogue system, this paper proves that the performance of a non-English dialogue system can be improved by utilizing English knowledge, highlighting the system uses cross-lingual knowledge. For the experiments, we 1) constructed a Korean version of the Wizard of Wikipedia dataset, 2) built Korean-English T5 (KE-T5), a language model pre-trained with Korean and English corpus, and 3) developed a knowledge-grounded Korean dialogue model based on KE-T5. We observed the performance improvement in the open-domain Korean dialogue model even only English knowledge was given. The experimental results showed that the knowledge inherent in cross-lingual language models can be helpful for generating responses in open dialogue systems.",
}

@article{2020t5,
  author  = {Colin Raffel and Noam Shazeer and Adam Roberts and Katherine Lee and Sharan Narang and Michael Matena and Yanqi Zhou and Wei Li and Peter J. Liu},
  title   = {Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer},
  journal = {Journal of Machine Learning Research},
  year    = {2020},
  volume  = {21},
  number  = {140},
  pages   = {1-67},
  url     = {http://jmlr.org/papers/v21/20-074.html}
}

APA

Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., ... & Liu, P. J. (2020). Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res., 21(140), 1-67.