🚀 The BLOOMZ & mT0 Model Project
BLOOMZ and mT0 are a family of models capable of following human instructions in dozens of languages zero-shot. They were produced by finetuning the pretrained multilingual models BLOOM and mT5 on the crosslingual task mixture xP3, and exhibit crosslingual generalization to unseen tasks and languages.
🚀 Quick Start
This project covers the BLOOMZ and mT0 model families, which perform strongly on natural language processing tasks. You can pick a model by parameter count and finetuning recipe to match your needs.
📚 Detailed Documentation
Model Overview
The BLOOMZ & mT0 model family
| Finetuning dataset \ Parameters | 300M | 580M | 1.2B | 3.7B | 13B | 560M | 1.1B | 1.7B | 3B | 7.1B | 176B |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Multitask finetuned on xP3; English prompts recommended | [mt0-small](https://huggingface.co/bigscience/mt0-small) | [mt0-base](https://huggingface.co/bigscience/mt0-base) | [mt0-large](https://huggingface.co/bigscience/mt0-large) | [mt0-xl](https://huggingface.co/bigscience/mt0-xl) | [mt0-xxl](https://huggingface.co/bigscience/mt0-xxl) | [bloomz-560m](https://huggingface.co/bigscience/bloomz-560m) | [bloomz-1b1](https://huggingface.co/bigscience/bloomz-1b1) | [bloomz-1b7](https://huggingface.co/bigscience/bloomz-1b7) | [bloomz-3b](https://huggingface.co/bigscience/bloomz-3b) | [bloomz-7b1](https://huggingface.co/bigscience/bloomz-7b1) | [bloomz](https://huggingface.co/bigscience/bloomz) |
| Multitask finetuned on xP3mt; non-English prompts recommended | | | | | [mt0-xxl-mt](https://huggingface.co/bigscience/mt0-xxl-mt) | | | | | [bloomz-7b1-mt](https://huggingface.co/bigscience/bloomz-7b1-mt) | [bloomz-mt](https://huggingface.co/bigscience/bloomz-mt) |
| Multitask finetuned on P3; research use only, weaker than the models above | | | | | [mt0-xxl-p3](https://huggingface.co/bigscience/mt0-xxl-p3) | | | | | [bloomz-7b1-p3](https://huggingface.co/bigscience/bloomz-7b1-p3) | [bloomz-p3](https://huggingface.co/bigscience/bloomz-p3) |
| Original pretrained checkpoints; not recommended for direct use | [mt5-small](https://huggingface.co/google/mt5-small) | [mt5-base](https://huggingface.co/google/mt5-base) | [mt5-large](https://huggingface.co/google/mt5-large) | [mt5-xl](https://huggingface.co/google/mt5-xl) | [mt5-xxl](https://huggingface.co/google/mt5-xxl) | [bloom-560m](https://huggingface.co/bigscience/bloom-560m) | [bloom-1b1](https://huggingface.co/bigscience/bloom-1b1) | [bloom-1b7](https://huggingface.co/bigscience/bloom-1b7) | [bloom-3b](https://huggingface.co/bigscience/bloom-3b) | [bloom-7b1](https://huggingface.co/bigscience/bloom-7b1) | [bloom](https://huggingface.co/bigscience/bloom) |
Model Usage
Intended Use
We recommend using these models to perform tasks expressed in natural language. For example, given the prompt "Translate to English: Je t’aime.", the model will most likely answer "I love you.". Some example prompts from the paper:
- 一个传奇的开端,一个不灭的神话,这不仅仅是一部电影,而是作为一个走进新时代的标签,永远彪炳史册。你认为这句话的立场是赞扬、中立还是批评?
- Suggest at least five related search terms to "Mạng neural nhân tạo".
- Write a fairy tale about a troll saving a princess from a dangerous dragon. The fairy tale is a masterpiece that has achieved praise worldwide and its moral is "Heroes Come in All Shapes and Sizes". Story (in Spanish):
- Explain in a sentence in Telugu what is backpropagation in neural networks.
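Prompting a checkpoint like the examples above can be sketched with the Hugging Face `transformers` library (assumed installed). This is a minimal illustration, not an official API: the `generate` helper name and its defaults are ours, and `bigscience/mt0-small` is chosen only for speed — any mT0 checkpoint from the table should work the same way, while the decoder-only BLOOMZ variants would use `AutoModelForCausalLM` instead.

```python
def generate(prompt: str,
             checkpoint: str = "bigscience/mt0-small",
             max_new_tokens: int = 64) -> str:
    """Zero-shot generation with an mT0 (encoder-decoder) checkpoint.

    transformers is imported lazily so this sketch can be inspected
    without the library or model weights being present.
    """
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example call (downloads the checkpoint on first use):
#   generate("Translate to English: Je t’aime.")
```

Note that for the xP3-finetuned models the prompt itself is best written in English, even when the input or desired output is in another language.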
Dataset Information

| Attribute | Details |
| --- | --- |
| Dataset | bigscience/xP3 |
| License | bigscience-bloom-rail-1.0 |
| Supported languages | ak, ar, as, bm, bn, ca, code, en, es, eu, fon, fr, gu, hi, id, ig, ki, kn, lg, ln, ml, mr, ne, nso, ny, or, pa, pt, rn, rw, sn, st, sw, ta, te, tn, ts, tum, tw, ur, vi, wo, xh, yo, zh, zu |
| Programming languages | C, C++, C#, Go, Java, JavaScript, Lua, PHP, Python, Ruby, Rust, Scala, TypeScript |
| Task type | Text generation |
Evaluation Results
The models were evaluated on a wide range of tasks and datasets; selected results follow.
Coreference resolution

| Dataset | Config | Accuracy |
| --- | --- | --- |
| winogrande | xl | 55.8 |
| Muennighoff/xwinograd | en | 66.02 |
| Muennighoff/xwinograd | fr | 57.83 |
| Muennighoff/xwinograd | jp | 52.87 |
| Muennighoff/xwinograd | pt | 57.79 |
| Muennighoff/xwinograd | ru | 54.92 |
| Muennighoff/xwinograd | zh | 63.69 |
Natural language inference

| Dataset | Config | Accuracy |
| --- | --- | --- |
| anli | r1 | 42.1 |
| anli | r2 | 39.5 |
| anli | r3 | 41.0 |
| super_glue | cb | 80.36 |
| super_glue | rte | 84.12 |
| xnli | ar | 53.25 |
| xnli | bg | 43.61 |
| xnli | de | 46.83 |
| xnli | el | 41.53 |
| xnli | en | 59.68 |
| xnli | es | 55.1 |
| xnli | fr | 55.26 |
| xnli | hi | 50.88 |
| xnli | ru | 47.75 |
| xnli | sw | 46.63 |
| xnli | th | 40.12 |
| xnli | tr | 37.55 |
| xnli | ur | 46.51 |
| xnli | vi | 52.93 |
| xnli | zh | 53.61 |
Program synthesis

| Dataset | Metric | Value |
| --- | --- | --- |
| openai_humaneval | Pass@1 | 8.06 |
| openai_humaneval | Pass@10 | 15.03 |
| openai_humaneval | Pass@100 | 27.49 |
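For context on the metric: Pass@k on HumanEval is conventionally reported with the unbiased estimator introduced with that benchmark. Given n generated samples per problem, of which c pass the unit tests, it estimates the probability that at least one of k drawn samples passes as 1 − C(n−c, k) / C(n, k). A self-contained sketch (the `pass_at_k` function name is ours, not from this repository):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate from n samples, of which c are correct."""
    if n - c < k:
        # Too few failing samples: every size-k subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# pass_at_k(20, 2, 1) -> 0.1: with 2 of 20 samples correct,
# a single draw succeeds 10% of the time.
```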
Sentence completion

| Dataset | Config | Accuracy |
| --- | --- | --- |
| story_cloze | 2016 | 90.43 |
| super_glue | copa | 86.0 |
| xcopa | et | 50.0 |
| xcopa | ht | 54.0 |
| xcopa | id | 76.0 |
| xcopa | it | 61.0 |
| xcopa | qu | 60.0 |
| xcopa | sw | 63.0 |
| xcopa | ta | 64.0 |
| xcopa | th | 57.0 |
| xcopa | tr | 53.0 |
| xcopa | vi | 79.0 |
| xcopa | zh | 81.0 |
| Muennighoff/xstory_cloze | ar | 83.26 |
| Muennighoff/xstory_cloze | es | 88.95 |
| Muennighoff/xstory_cloze | eu | 73.33 |
| Muennighoff/xstory_cloze | hi | 80.61 |
| Muennighoff/xstory_cloze | id | 84.25 |
| Muennighoff/xstory_cloze | my | 52.55 |
| Muennighoff/xstory_cloze | ru | 65.32 |
| Muennighoff/xstory_cloze | sw | 71.67 |
| Muennighoff/xstory_cloze | te | 74.72 |
| Muennighoff/xstory_cloze | zh | 85.37 |
Widget Examples

| Text | Example title |
| --- | --- |
| 一个传奇的开端,一个不灭的神话,这不仅仅是一部电影,而是作为一个走进新时代的标签,永远彪炳史册。Would you rate the previous review as positive, neutral or negative? | zh-en sentiment |
| 一个传奇的开端,一个不灭的神话,这不仅仅是一部电影,而是作为一个走进新时代的标签,永远彪炳史册。你认为这句话的立场是赞扬、中立还是批评? | zh-zh sentiment |
| Suggest at least five related search terms to "Mạng neural nhân tạo". | vi-en query |
| Proposez au moins cinq mots clés concernant «Réseau de neurones artificiels». | fr-fr query |
| Explain in a sentence in Telugu what is backpropagation in neural networks. | te-en qa |
| Why is the sky blue? | en-en qa |
| Write a fairy tale about a troll saving a princess from a dangerous dragon. The fairy tale is a masterpiece that has achieved praise worldwide and its moral is "Heroes Come in All Shapes and Sizes". Story (in Spanish): | es-en fable |
| Write a fable about wood elves living in a forest that is suddenly invaded by ogres. The fable is a masterpiece that has achieved praise worldwide and its moral is "Violence is the last refuge of the incompetent". Fable (in Hindi): | hi-en fable |
Citation
If you use the models or code from this project, please cite the accompanying paper: Crosslingual Generalization through Multitask Finetuning.