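🚀 MVP Model
MVP is a model designed for natural language generation. It is pre-trained with multi-task supervision and can be adapted to a wide range of generation and understanding tasks.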
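🚀 Quick Start
The MVP model was proposed in the paper MVP: Multi-task Supervised Pre-training for Natural Language Generation by Tianyi Tang, Junyi Li, Wayne Xin Zhao and Ji-Rong Wen.
Detailed information and instructions are available at https://github.com/RUCAIBox/MVP.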
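✨ Key Features
- Multi-task adaptation: MVP is pre-trained in a supervised fashion on a mixture of labeled datasets. It follows a standard Transformer encoder-decoder architecture and is designed specifically for natural language generation, so it can be adapted to a wide range of generation tasks, including but not limited to summarization, data-to-text generation, open-ended dialogue systems, story generation, question answering, question generation, task-oriented dialogue systems, commonsense generation, paraphrase generation, text style transfer, and text simplification. The model can also be adapted to natural language understanding tasks such as sequence classification and (extractive) question answering.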
💻 Usage Examples
Basic Usage
Summarization
>>> from transformers import MvpTokenizer, MvpForConditionalGeneration
>>> tokenizer = MvpTokenizer.from_pretrained("RUCAIBox/mvp")
>>> model = MvpForConditionalGeneration.from_pretrained("RUCAIBox/mvp")
>>> inputs = tokenizer(
... "Summarize: You may want to stick it to your boss and leave your job, but don't do it if these are your reasons.",
... return_tensors="pt",
... )
>>> generated_ids = model.generate(**inputs)
>>> tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
["Why You Shouldn't Quit Your Job"]
Data-to-text generation
>>> from transformers import MvpTokenizerFast, MvpForConditionalGeneration
>>> tokenizer = MvpTokenizerFast.from_pretrained("RUCAIBox/mvp")
>>> model = MvpForConditionalGeneration.from_pretrained("RUCAIBox/mvp")
>>> inputs = tokenizer(
... "Describe the following data: Iron Man | instance of | Superhero [SEP] Stan Lee | creator | Iron Man",
... return_tensors="pt",
... )
>>> generated_ids = model.generate(**inputs)
>>> tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
['Stan Lee created the character of Iron Man, a fictional superhero appearing in American comic']
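The input string in the data-to-text example above joins the fields of each triple with ` | ` and separates triples with ` [SEP] `, after the task prefix `Describe the following data:`. The following is a minimal sketch of a helper that builds such a string from Python tuples. The function name `build_data_to_text_prompt` and the `Triple` alias are hypothetical and not part of `transformers` or the MVP repository; the format is inferred only from the single example shown here.

```python
from typing import Iterable, Tuple

# Hypothetical alias for a (subject, relation, object) triple.
Triple = Tuple[str, str, str]


def build_data_to_text_prompt(triples: Iterable[Triple]) -> str:
    """Linearize triples into one input string (illustrative helper)."""
    # Fields inside a triple are joined with " | " and triples with " [SEP] ",
    # mirroring the input string used in the example above.
    linearized = " [SEP] ".join(" | ".join(triple) for triple in triples)
    return f"Describe the following data: {linearized}"


prompt = build_data_to_text_prompt(
    [
        ("Iron Man", "instance of", "Superhero"),
        ("Stan Lee", "creator", "Iron Man"),
    ]
)
print(prompt)
```

The resulting `prompt` can be passed to the tokenizer in place of the hand-written string in the example above.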
📚 Documentation
Related Models
- MVP: https://huggingface.co/RUCAIBox/mvp
- Prompt-based models:
  - MVP-multi-task: [https://huggingface.co/RUCAIBox/mvp-multi-task](https://huggingface.co/RUCAIBox/mvp-multi-task)
  - MVP-summarization: [https://huggingface.co/RUCAIBox/mvp-summarization](https://huggingface.co/RUCAIBox/mvp-summarization)
  - MVP-open-dialog: [https://huggingface.co/RUCAIBox/mvp-open-dialog](https://huggingface.co/RUCAIBox/mvp-open-dialog)
  - MVP-data-to-text: [https://huggingface.co/RUCAIBox/mvp-data-to-text](https://huggingface.co/RUCAIBox/mvp-data-to-text)
  - MVP-story: [https://huggingface.co/RUCAIBox/mvp-story](https://huggingface.co/RUCAIBox/mvp-story)
  - MVP-question-answering: [https://huggingface.co/RUCAIBox/mvp-question-answering](https://huggingface.co/RUCAIBox/mvp-question-answering)
  - MVP-question-generation: [https://huggingface.co/RUCAIBox/mvp-question-generation](https://huggingface.co/RUCAIBox/mvp-question-generation)
  - MVP-task-dialog: [https://huggingface.co/RUCAIBox/mvp-task-dialog](https://huggingface.co/RUCAIBox/mvp-task-dialog)
- Multi-task models:
  - MTL-summarization: [https://huggingface.co/RUCAIBox/mtl-summarization](https://huggingface.co/RUCAIBox/mtl-summarization)
  - MTL-open-dialog: [https://huggingface.co/RUCAIBox/mtl-open-dialog](https://huggingface.co/RUCAIBox/mtl-open-dialog)
  - MTL-data-to-text: [https://huggingface.co/RUCAIBox/mtl-data-to-text](https://huggingface.co/RUCAIBox/mtl-data-to-text)
  - MTL-story: [https://huggingface.co/RUCAIBox/mtl-story](https://huggingface.co/RUCAIBox/mtl-story)
  - MTL-question-answering: [https://huggingface.co/RUCAIBox/mtl-question-answering](https://huggingface.co/RUCAIBox/mtl-question-answering)
  - MTL-question-generation: [https://huggingface.co/RUCAIBox/mtl-question-generation](https://huggingface.co/RUCAIBox/mtl-question-generation)
  - MTL-task-dialog: [https://huggingface.co/RUCAIBox/mtl-task-dialog](https://huggingface.co/RUCAIBox/mtl-task-dialog)
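The checkpoints listed above follow a regular naming pattern on the Hugging Face Hub: `RUCAIBox/<family>-<task>`, where the family is `mvp` or `mtl`. As a rough sketch, a small lookup helper can turn a task name into a repository ID. The function `checkpoint_for` and the constants below are hypothetical conveniences, not part of any library, and the set of valid names is taken only from the list above.

```python
# Task suffixes shared by the prompt-based (mvp) and multi-task (mtl) families,
# copied from the model list above.
SHARED_TASKS = (
    "summarization",
    "open-dialog",
    "data-to-text",
    "story",
    "question-answering",
    "question-generation",
    "task-dialog",
)
# Only the mvp family lists an additional "multi-task" checkpoint.
TASKS_BY_FAMILY = {
    "mvp": SHARED_TASKS + ("multi-task",),
    "mtl": SHARED_TASKS,
}


def checkpoint_for(task: str, family: str = "mvp") -> str:
    """Return the Hub repository ID for a task (illustrative helper)."""
    if family not in TASKS_BY_FAMILY:
        raise ValueError(f"unknown family: {family!r}")
    if task not in TASKS_BY_FAMILY[family]:
        raise ValueError(f"no {family} checkpoint listed for task: {task!r}")
    return f"RUCAIBox/{family}-{task}"


print(checkpoint_for("summarization"))
print(checkpoint_for("story", family="mtl"))
```

The returned string is intended to be used as the model ID in a `from_pretrained` call like the ones in the usage examples; see the linked model pages for the loading instructions specific to each checkpoint.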
📄 License
This project is licensed under the Apache-2.0 license.
📚 Citation
If you use this model in your research, please cite it with the following BibTeX entry:
@article{tang2022mvp,
title={MVP: Multi-task Supervised Pre-training for Natural Language Generation},
author={Tang, Tianyi and Li, Junyi and Zhao, Wayne Xin and Wen, Ji-Rong},
journal={arXiv preprint arXiv:2206.12131},
year={2022},
url={https://arxiv.org/abs/2206.12131},
}