MT0-XL: Open-Source Multilingual Text Generation Model with Free Support for Coreference Resolution and Natural Language Inference

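Developed by bigscience

MT0-XL is a multilingual text generation model that supports many languages and tasks, including coreference resolution and natural language inference.

Large language model

Transformers

Multilingual support; License: Apache-2.0; #Multilingual coreference resolution #Cross-lingual natural language inference #Low-resource language support

Downloads: 2,807

Released: 10/27/2022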

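Model Overview

MT0-XL is a multilingual model based on the Transformer architecture, suited to a range of natural language processing tasks such as coreference resolution and natural language inference.

Model Features

Multilingual support

Covers about 100 languages (the language list on this page has 101 tags, including "und"), so it can be applied to natural language processing tasks worldwide.

Multitask processing

Performs multiple natural language processing tasks, including coreference resolution and natural language inference.

Benchmark performance

Evaluated zero-shot on multiple benchmarks. On coreference resolution it scores 52.49% on Winogrande XL and between 59.04% and 66.16% on XWinograd, depending on the language.

Model Capabilities

Text generation

Coreference resolution

Natural language inference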

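Use Cases

Natural language processing

Coreference resolution

Resolves which entity a pronoun or mention in a text refers to, improving the accuracy of text understanding.

Accuracy on the Winogrande XL dataset is 52.49%.

Natural language inference

Determines the logical relationship between two sentences (entailment, contradiction, or neutral).

Accuracy on the ANLI dataset is 38.2% (r1), 34.8% (r2), and 39% (r3).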
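
To put these accuracies in context, the sketch below compares them with uniform random guessing. It assumes Winogrande is scored as a two-option choice and ANLI as a three-way classification (entailment, neutral, contradiction). The helper name margin_over_chance is illustrative and not part of any library.

```python
def margin_over_chance(accuracy_pct: float, num_choices: int) -> float:
    """Return accuracy minus the uniform-guess baseline, in percentage points."""
    chance_pct = 100.0 / num_choices
    return round(accuracy_pct - chance_pct, 2)


# Accuracies are the ones quoted above; the number of answer options
# per task is an assumption made for this sketch.
results = {
    "Winogrande XL": (52.49, 2),
    "ANLI r1": (38.2, 3),
    "ANLI r2": (34.8, 3),
    "ANLI r3": (39.0, 3),
}

margins = {name: margin_over_chance(acc, k) for name, (acc, k) in results.items()}

for name, margin in margins.items():
    print(f"{name}: {margin:+.2f} points over chance")
```

Under these assumptions, every reported score sits only a few points above its guessing baseline, which is worth keeping in mind when choosing this checkpoint for these tasks.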

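🚀 Introduction to the BLOOMZ & mT0 Models

We present BLOOMZ and mT0, a family of models capable of following human instructions in dozens of languages zero-shot. We finetune the BLOOM and mT5 pretrained multilingual language models on our crosslingual task mixture (xP3) and find that the resulting models generalize crosslingually to unseen tasks and languages.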

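🚀 Quick Start

Usage Recommendations

We recommend using the model to perform tasks expressed in natural language. For example, given the prompt “Translate to English: Je t’aime.”, the model will most likely answer “I love you.”. Here are some prompt examples from the paper: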

一个传奇的开端，一个不灭的神话，这不仅仅是一部电影，而是作为一个走进新时代的标签，永远彪炳史册。你认为这句话的立场是赞扬、中立还是批评?
Suggest at least five related search terms to "Mạng neural nhân tạo".
Write a fairy tale about a troll saving a princess from a dangerous dragon. The fairy tale is a masterpiece that has achieved praise worldwide and its moral is "Heroes Come in All Shapes and Sizes". Story (in Spanish):
Explain in a sentence in Telugu what is backpropagation in neural networks.

Feel free to share your generations in the Community tab!

Code Examples

CPU

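# pip install -q transformers
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "bigscience/mt0-xl"

# Load the tokenizer and the seq2seq model weights (placed on CPU by default).
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# Tokenize the prompt, generate, and decode the first returned sequence.
# decode() also accepts skip_special_tokens=True to drop markers such as <pad> and </s>.
inputs = tokenizer.encode("Translate to English: Je t’aime.", return_tensors="pt")
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0]))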

GPU

# pip install -q transformers accelerate
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "bigscience/mt0-xl"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# torch_dtype="auto" keeps the dtype stored in the checkpoint;
# device_map="auto" lets accelerate place the weights on the available device(s).
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint, torch_dtype="auto", device_map="auto")

# The input tensor must live on the GPU as well.
inputs = tokenizer.encode("Translate to English: Je t’aime.", return_tensors="pt").to("cuda")
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0]))

GPU in 8bit

# pip install -q transformers accelerate bitsandbytes
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "bigscience/mt0-xl"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# load_in_8bit=True quantizes the weights to 8-bit via bitsandbytes to reduce GPU memory use.
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint, device_map="auto", load_in_8bit=True)

# The input tensor must live on the GPU as well.
inputs = tokenizer.encode("Translate to English: Je t’aime.", return_tensors="pt").to("cuda")
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0]))

✨ Key Features

Follows human instructions in dozens of languages zero-shot.
Generalizes crosslingually to unseen tasks and languages.

📦 Installation

The model requires the transformers library. Depending on the use case, the accelerate and bitsandbytes libraries may also be needed. Install with:

# CPU
pip install -q transformers

# GPU
pip install -q transformers accelerate

# GPU in 8bit
pip install -q transformers accelerate bitsandbytes

🔧 Technical Details

Model

Architecture: same as mt5-xl; see also the config.json file.
Finetuning steps: 10,000.
Finetuning tokens: 1.85 billion.
Precision: bfloat16.
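
As a quick worked calculation, the two finetuning figures above imply an average number of tokens per step. The sketch assumes every step consumes the same number of tokens, which the source does not state; the variable names are illustrative.

```python
FINETUNING_STEPS = 10_000
FINETUNING_TOKENS = 1_850_000_000  # 1.85 billion, as listed above

# Average tokens consumed per optimizer step (assumes equal-sized steps).
tokens_per_step = FINETUNING_TOKENS // FINETUNING_STEPS

print(f"{tokens_per_step:,} tokens per step on average")
```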

Hardware

TPUs: TPUv4-128.

Software

Orchestration: T5X.
Neural networks: Jax.

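📚 Detailed Documentation

Model Overview

BLOOMZ and mT0 are a family of models capable of following human instructions in dozens of languages zero-shot. They are obtained by finetuning the BLOOM and mT5 pretrained multilingual language models on the crosslingual task mixture (xP3), and the resulting models show crosslingual generalization.

Model Usage

We recommend using the model to perform tasks expressed in natural language; performance may vary depending on the prompt. For BLOOMZ models, make it clear where the input ends so that the model does not try to continue it. Also give the model as much context as possible.

Model Training

The model shares the mt5-xl architecture and was finetuned for 10,000 steps on 1.85 billion tokens in bfloat16 precision. Training ran on TPUv4-128 hardware, with T5X for orchestration and Jax for the neural networks.

Model Evaluation

Zero-shot results on unseen tasks are reported in Table 7 of the paper Crosslingual Generalization through Multitask Finetuning and in bigscience/evaluation-results. The sidebar reports the zero-shot performance of the best prompt per dataset config.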

📄 License

This project is released under the Apache-2.0 license.

Related Information

Datasets

Supported Languages

The model supports the following languages: af, am, ar, az, be, bg, bn, ca, ceb, co, cs, cy, da, de, el, en, eo, es, et, eu, fa, fi, fil, fr, fy, ga, gd, gl, gu, ha, haw, hi, hmn, ht, hu, hy, ig, is, it, iw, ja, jv, ka, kk, km, kn, ko, ku, ky, la, lb, lo, lt, lv, mg, mi, mk, ml, mn, mr, ms, mt, my, ne, nl, no, ny, pa, pl, ps, pt, ro, ru, sd, si, sk, sl, sm, sn, so, sq, sr, st, su, sv, sw, ta, te, tg, th, tr, uk, und, ur, uz, vi, xh, yi, yo, zh, zu.
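
A minimal sketch of checking a language code against the list above before prompting. The function name is_listed and the alias table are hypothetical helpers for illustration. The alias covers one known quirk: the list uses the legacy code iw for Hebrew, while callers may pass the newer he.

```python
# Language tags copied from the list above.
SUPPORTED = set("""
af am ar az be bg bn ca ceb co
cs cy da de el en eo es et eu
fa fi fil fr fy ga gd gl gu ha
haw hi hmn ht hu hy ig is it iw
ja jv ka kk km kn ko ku ky la
lb lo lt lv mg mi mk ml mn mr
ms mt my ne nl no ny pa pl ps
pt ro ru sd si sk sl sm sn so
sq sr st su sv sw ta te tg th
tr uk und ur uz vi xh yi yo zh
zu
""".split())

# Assumption: map newer ISO 639-1 codes onto the legacy codes used in the list.
LEGACY_ALIASES = {"he": "iw"}


def is_listed(code: str) -> bool:
    """Return True if the (case-insensitive) code appears in the list above."""
    code = code.strip().lower()
    return LEGACY_ALIASES.get(code, code) in SUPPORTED


print(len(SUPPORTED), "language tags listed")
print("te listed:", is_listed("te"))
```

A tag being listed only means the model card declares it; it says nothing about output quality in that language.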

Model Metrics

Evaluation metrics for the model across tasks and datasets:

Task type	Dataset	Name	Config	Split	Revision	Metric type	Value
Coreference resolution	winogrande	Winogrande XL (xl)	xl	validation	a80f460359d1e9a67c006011c94de42a8759430c	Accuracy	52.49
Coreference resolution	Muennighoff/xwinograd	XWinograd (en)	en	test	9dd5ea5505fad86b7bedad667955577815300cee	Accuracy	61.89
Coreference resolution	Muennighoff/xwinograd	XWinograd (fr)	fr	test	9dd5ea5505fad86b7bedad667955577815300cee	Accuracy	59.04
Coreference resolution	Muennighoff/xwinograd	XWinograd (jp)	jp	test	9dd5ea5505fad86b7bedad667955577815300cee	Accuracy	60.27
Coreference resolution	Muennighoff/xwinograd	XWinograd (pt)	pt	test	9dd5ea5505fad86b7bedad667955577815300cee	Accuracy	66.16
Coreference resolution	Muennighoff/xwinograd	XWinograd (ru)	ru	test	9dd5ea5505fad86b7bedad667955577815300cee	Accuracy	59.05
Coreference resolution	Muennighoff/xwinograd	XWinograd (zh)	zh	test	9dd5ea5505fad86b7bedad667955577815300cee	Accuracy	62.9
Natural language inference	anli	ANLI (r1)	r1	validation	9dbd830a06fea8b1c49d6e5ef2004a08d9f45094	Accuracy	38.2
Natural language inference	anli	ANLI (r2)	r2	validation	9dbd830a06fea8b1c49d6e5ef2004a08d9f45094	Accuracy	34.8
Natural language inference	anli	ANLI (r3)	r3	validation	9dbd830a06fea8b1c49d6e5ef2004a08d9f45094	Accuracy	39
Natural language inference	super_glue	SuperGLUE (cb)	cb	validation	9e12063561e7e6c79099feb6d5a493142584e9e2	Accuracy	85.71
Natural language inference	super_glue	SuperGLUE (rte)	rte	validation	9e12063561e7e6c79099feb6d5a493142584e9e2	Accuracy	78.7
Natural language inference	xnli	XNLI (ar)	ar	validation	a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16	Accuracy	51.85
Natural language inference	xnli	XNLI (bg)	bg	validation	a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16	Accuracy	54.18
Natural language inference	xnli	XNLI (de)	de	validation	a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16	Accuracy	54.78
Natural language inference	xnli	XNLI (el)	el	validation	a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16	Accuracy	53.78
Natural language inference	xnli	XNLI (en)	en	validation	a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16	Accuracy	56.83
Natural language inference	xnli	XNLI (es)	es	validation	a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16	Accuracy	54.78
Natural language inference	xnli	XNLI (fr)	fr	validation	a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16	Accuracy	54.22
Natural language inference	xnli	XNLI (hi)	hi	validation	a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16	Accuracy	50.24
Natural language inference	xnli	XNLI (ru)	ru	validation	a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16	Accuracy	53.09
Natural language inference	xnli	XNLI (sw)	sw	validation	a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16	Accuracy	49.6
Natural language inference	xnli	XNLI (th)	th	validation	a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16	Accuracy	52.13
Natural language inference	xnli	XNLI (tr)	tr	validation	a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16	Accuracy	50.56
Natural language inference	xnli	XNLI (ur)	ur	validation	a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16	Accuracy	47.91
Natural language inference	xnli	XNLI (vi)	vi	validation	a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16	Accuracy	53.21
Natural language inference	xnli	XNLI (zh)	zh	validation	a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16	Accuracy	50.64
Program synthesis	openai_humaneval	HumanEval	None	test	e8dc562f5de170c54b5481011dd9f4fa04845771	Pass@1	0
Program synthesis	openai_humaneval	HumanEval	None	test	e8dc562f5de170c54b5481011dd9f4fa04845771	Pass@10	0
Program synthesis	openai_humaneval	HumanEval	None	test	e8dc562f5de170c54b5481011dd9f4fa04845771	Pass@100	0
Sentence completion	story_cloze	StoryCloze (2016)	2016	validation	e724c6f8cdf7c7a2fb229d862226e15b023ee4db	Accuracy	79.1
Sentence completion	super_glue	SuperGLUE (copa)	copa	validation	9e12063561e7e6c79099feb6d5a493142584e9e2	Accuracy	72
Sentence completion	xcopa	XCOPA (et)	et	validation	37f73c60fb123111fa5af5f9b705d0b3747fd187	Accuracy	70
Sentence completion	xcopa	XCOPA (ht)	ht	validation	37f73c60fb123111fa5af5f9b705d0b3747fd187	Accuracy	66
Sentence completion	xcopa	XCOPA (id)	id	validation	37f73c60fb123111fa5af5f9b705d0b3747fd187	Accuracy	71
Sentence completion	xcopa	XCOPA (it)	it	validation	37f73c60fb123111fa5af5f9b705d0b3747fd187	Accuracy	70
Sentence completion	xcopa	XCOPA (qu)	qu	validation	37f73c60fb123111fa5af5f9b705d0b3747fd187	Accuracy	56
Sentence completion	xcopa	XCOPA (sw)	sw	validation	37f73c60fb123111fa5af5f9b705d0b3747fd187	Accuracy	53
Sentence completion	xcopa	XCOPA (ta)	ta	validation	37f73c60fb123111fa5af5f9b705d0b3747fd187	Accuracy	64
Sentence completion	xcopa	XCOPA (th)	th	validation	37f73c60fb123111fa5af5f9b705d0b3747fd187	Accuracy	60
Sentence completion	xcopa	XCOPA (tr)	tr	validation	37f73c60fb123111fa5af5f9b705d0b3747fd187	Accuracy	58
Sentence completion	xcopa	XCOPA (vi)	vi	validation	37f73c60fb123111fa5af5f9b705d0b3747fd187	Accuracy	68
Sentence completion	xcopa	XCOPA (zh)	zh	validation	37f73c60fb123111fa5af5f9b705d0b3747fd187	Accuracy	65
Sentence completion	Muennighoff/xstory_cloze	XStoryCloze (ar)	ar	validation	8bb76e594b68147f1a430e86829d07189622b90d	Accuracy	70.09
Sentence completion	Muennighoff/xstory_cloze	XStoryCloze (es)	es	validation	8bb76e594b68147f1a430e86829d07189622b90d	Accuracy	77.17
Sentence completion	Muennighoff/xstory_cloze	XStoryCloze (eu)	eu	validation	8bb76e594b68147f1a430e86829d07189622b90d	Accuracy	69.03
Sentence completion	Muennighoff/xstory_cloze	XStoryCloze (hi)	hi	validation	8bb76e594b68147f1a430e86829d07189622b90d	Accuracy	71.08
Sentence completion	Muennighoff/xstory_cloze	XStoryCloze (id)	id	validation	8bb76e594b68147f1a430e86829d07189622b90d	Accuracy	75.71
Sentence completion	Muennighoff/xstory_cloze	XStoryCloze (my)	my	validation	8bb76e594b68147f1a430e86829d07189622b90d	Accuracy	65.65
Sentence completion	Muennighoff/xstory_cloze	XStoryCloze (ru)	ru	validation	8bb76e594b68147f1a430e86829d07189622b90d	Accuracy	74.85
Sentence completion	Muennighoff/xstory_cloze	XStoryCloze (sw)	sw	validation	8bb76e594b68147f1a430e86829d07189622b90d	Accuracy	71.14
Sentence completion	Muennighoff/xstory_cloze	XStoryCloze (te)	te	validation	8bb76e594b68147f1a430e86829d07189622b90d	Accuracy	68.89
Sentence completion	Muennighoff/xstory_cloze	XStoryCloze (zh)	zh	validation	8bb76e594b68147f1a430e86829d07189622b90d	Accuracy	72.93
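
The per-language rows above can be summarized with an unweighted mean per multilingual benchmark. This is a minimal sketch: the averages are derived here for convenience and are not figures reported by the source, and weighting every language config equally is an assumption.

```python
from statistics import mean

# Accuracy values copied from the table above, one entry per language config.
scores = {
    "XWinograd": [61.89, 59.04, 60.27, 66.16, 59.05, 62.9],
    "ANLI": [38.2, 34.8, 39.0],
    "XCOPA": [70, 66, 71, 70, 56, 53, 64, 60, 58, 68, 65],
    "XStoryCloze": [70.09, 77.17, 69.03, 71.08, 75.71,
                    65.65, 74.85, 71.14, 68.89, 72.93],
}

# Unweighted (macro) average over configs, rounded to two decimals.
averages = {name: round(mean(values), 2) for name, values in scores.items()}

for name, avg in averages.items():
    print(f"{name}: {avg:.2f}")
```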

BLOOMZ & mT0 Model Family

Finetuning dataset	Parameters	300M	580M	1.2B	3.7B	13B	560M	1.1B	1.7B	3B	7.1B	176B
Multitask finetuned on xP3. Recommended for prompting in English.	Finetuned model	[mt0-small](https://huggingface.co/bigscience/mt0-small)	[mt0-base](https://huggingface.co/bigscience/mt0-base)	[mt0-large](https://huggingface.co/bigscience/mt0-large)	[mt0-xl](https://huggingface.co/bigscience/mt0-xl)	[mt0-xxl](https://huggingface.co/bigscience/mt0-xxl)	[bloomz-560m](https://huggingface.co/bigscience/bloomz-560m)	[bloomz-1b1](https://huggingface.co/bigscience/bloomz-1b1)	[bloomz-1b7](https://huggingface.co/bigscience/bloomz-1b7)	[bloomz-3b](https://huggingface.co/bigscience/bloomz-3b)	[bloomz-7b1](https://huggingface.co/bigscience/bloomz-7b1)	bloomz
Multitask finetuned on xP3mt. Recommended for prompting in non-English.	Finetuned model					[mt0-xxl-mt](https://huggingface.co/bigscience/mt0-xxl-mt)					[bloomz-7b1-mt](https://huggingface.co/bigscience/bloomz-7b1-mt)	[bloomz-mt](https://huggingface.co/bigscience/bloomz-mt)
Multitask finetuned on P3. Released for research purposes only. Strictly inferior to the models above!	Finetuned model					[mt0-xxl-p3](https://huggingface.co/bigscience/mt0-xxl-p3)					[bloomz-7b1-p3](https://huggingface.co/bigscience/bloomz-7b1-p3)	[bloomz-p3](https://huggingface.co/bigscience/bloomz-p3)
Original pretrained checkpoints. Not recommended.	Pretrained model	[mt5-small](https://huggingface.co/google/mt5-small)	[mt5-base](https://huggingface.co/google/mt5-base)	[mt5-large](https://huggingface.co/google/mt5-large)	[mt5-xl](https://huggingface.co/google/mt5-xl)	[mt5-xxl](https://huggingface.co/google/mt5-xxl)	[bloom-560m](https://huggingface.co/bigscience/bloom-560m)	[bloom-1b1](https://huggingface.co/bigscience/bloom-1b1)	[bloom-1b7](https://huggingface.co/bigscience/bloom-1b7)	[bloom-3b](https://huggingface.co/bigscience/bloom-3b)	[bloom-7b1](https://huggingface.co/bigscience/bloom-7b1)	bloom
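
One way to use the table is to pick the largest xP3-finetuned mT0 checkpoint that fits a parameter budget. The sketch below is illustrative only: the helper largest_within_budget is hypothetical, the budget is expressed in parameters rather than memory, and the sizes are the rounded column headers from the table.

```python
from typing import Optional

# (parameter count from the table header, Hugging Face checkpoint id), smallest first.
MT0_CHECKPOINTS = [
    (300_000_000, "bigscience/mt0-small"),
    (580_000_000, "bigscience/mt0-base"),
    (1_200_000_000, "bigscience/mt0-large"),
    (3_700_000_000, "bigscience/mt0-xl"),
    (13_000_000_000, "bigscience/mt0-xxl"),
]


def largest_within_budget(max_params: int) -> Optional[str]:
    """Return the biggest listed mT0 checkpoint with at most max_params parameters."""
    fitting = [name for size, name in MT0_CHECKPOINTS if size <= max_params]
    return fitting[-1] if fitting else None


print(largest_within_budget(4_000_000_000))
```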

Citation

If you use this model, please cite the following paper:

@article{muennighoff2022crosslingual,
  title={Crosslingual generalization through multitask finetuning},
  author={Muennighoff, Niklas and Wang, Thomas and Sutawika, Lintang and Roberts, Adam and Biderman, Stella and Scao, Teven Le and Bari, M Saiful and Shen, Sheng and Yong, Zheng-Xin and Schoelkopf, Hailey and others},
  journal={arXiv preprint arXiv:2211.01786},
  year={2022}
}

Contact Information

Repository: [bigscience-workshop/xmtf](https://github.com/bigscience-workshop/xmtf)
Paper: Crosslingual Generalization through Multitask Finetuning
Point of contact: Niklas Muennighoff

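Limitations

Prompt engineering: Performance may vary depending on the prompt. For BLOOMZ models, make it very clear where the input stops so that the model does not try to continue it. For example, the prompt “Translate to English: Je t'aime” without the full stop (.) at the end may cause the model to continue the French sentence. Better prompts include “Translate to English: Je t'aime.”, “Translate to English: Je t'aime. Translation:”, and “What is "Je t'aime." in English?”, where it is clear to the model when it should start answering. In addition, give the model as much context as possible. For example, if you want it to answer in Telugu, say so, as in “Explain in a sentence in Telugu what is backpropagation in neural networks.”.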
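
The advice about marking the end of the input can be applied mechanically before a prompt is sent to the model. This is a minimal sketch: close_prompt and the set of terminating characters are assumptions made for illustration, not part of the model or of any library.

```python
# Characters treated as a clear end of input (an assumption; extend per language).
TERMINATORS = ".?!:。？！"


def close_prompt(prompt: str, cue: str = "") -> str:
    """Ensure the prompt ends with terminal punctuation, then append an optional answer cue."""
    prompt = prompt.rstrip()
    if prompt and prompt[-1] not in TERMINATORS:
        prompt += "."
    return f"{prompt} {cue}".rstrip()


print(close_prompt("Translate to English: Je t'aime"))
print(close_prompt("Translate to English: Je t'aime.", cue="Translation:"))
```

Prompts that already end in a question mark or full stop pass through unchanged, so the helper is safe to apply to every prompt.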

Model Widget Examples

Example title	Text
zh-en sentiment	一个传奇的开端，一个不灭的神话，这不仅仅是一部电影，而是作为一个走进新时代的标签，永远彪炳史册。Would you rate the previous review as positive, neutral or negative?
zh-zh sentiment	一个传奇的开端，一个不灭的神话，这不仅仅是一部电影，而是作为一个走进新时代的标签，永远彪炳史册。你认为这句话的立场是赞扬、中立还是批评？
vi-en query	Suggest at least five related search terms to "Mạng neural nhân tạo".
fr-fr query	Proposez au moins cinq mots clés concernant «Réseau de neurones artificiels».
te-en qa	Explain in a sentence in Telugu what is backpropagation in neural networks.
en-en qa	Why is the sky blue?
es-en fable	Write a fairy tale about a troll saving a princess from a dangerous dragon. The fairy tale is a masterpiece that has achieved praise worldwide and its moral is "Heroes Come in All Shapes and Sizes". Story (in Spanish):
hi-en fable	Write a fable about wood elves living in a forest that is suddenly invaded by ogres. The fable is a masterpiece that has achieved praise worldwide and its moral is "Violence is the last refuge of the incompetent". Fable (in Hindi):