MT0-XL
Developed by BigScience
MT0-XL is a multilingual text generation model that supports multiple languages and tasks, including coreference resolution and natural language inference.
Downloads: 2,807
Release date: 10/27/2022
Model Overview
MT0-XL is a multilingual model based on the Transformer architecture, suitable for various natural language processing tasks such as coreference resolution and natural language inference.
Model Features
Multilingual Support
Supports over 100 languages, suitable for natural language processing tasks worldwide.
Multi-task Processing
Capable of performing various natural language processing tasks, including coreference resolution and natural language inference.
High Performance
Strong results on several benchmarks, e.g. 85.71% accuracy on SuperGLUE (cb) and 79.1% on StoryCloze 2016.
Model Capabilities
Text generation
Coreference resolution
Natural language inference
Use Cases
Natural Language Processing
Coreference Resolution
Resolves pronoun and entity references in text, improving the accuracy of text understanding.
Achieved an accuracy of 52.49% on the Winogrande XL dataset.
Natural Language Inference
Used to determine the logical relationship between two sentences (such as entailment, contradiction, or neutrality).
Achieved accuracies of 38.2% (r1), 34.8% (r2), and 39% (r3) on the ANLI dataset.
🚀 MT0-XL Model README
This document provides detailed information about the `mt0-xl` model, including its datasets, supported languages, license, and performance metrics.
📦 Datasets
The model is trained on the following datasets:
- bigscience/xP3
- mc4
📄 License
The model is released under the Apache-2.0 license.
🌐 Supported Languages
The model supports a wide range of languages, including:
- af, am, ar, az, be, bg, bn, ca, ceb, co, cs, cy, da, de, el, en, eo, es, et, eu, fa, fi, fil, fr, fy, ga, gd, gl, gu, ha, haw, hi, hmn, ht, hu, hy, ig, is, it, iw, ja, jv, ka, kk, km, kn, ko, ku, ky, la, lb, lo, lt, lv, mg, mi, mk, ml, mn, mr, ms, mt, my, ne, nl, no, ny, pa, pl, ps, pt, ro, ru, sd, si, sk, sl, sm, sn, so, sq, sr, st, su, sv, sw, ta, te, tg, th, tr, uk, und, ur, uz, vi, xh, yi, yo, zh, zu (`und` marks text of undetermined language)
🛠️ Pipeline Tag
The model's pipeline tag is `text2text-generation`.
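A minimal inference sketch matching this pipeline tag, assuming the Hugging Face `transformers` library and a backend such as PyTorch are installed (the checkpoint is several GB, so the import and download are deferred until the function is called):

```python
def build_generator(model_id: str = "bigscience/mt0-xl"):
    """Return a text2text-generation pipeline for MT0-XL.

    `transformers` is imported lazily because the pipeline downloads
    the full checkpoint on first use.
    """
    from transformers import pipeline
    return pipeline("text2text-generation", model=model_id)

# Example usage (triggers the checkpoint download):
#   generator = build_generator()
#   generator("Why is the sky blue?", max_new_tokens=64)
```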
🧪 Widget Examples
The following table lists example prompts for the model's widget, with their titles:
Example Title | Text |
---|---|
zh-en sentiment | 一个传奇的开端,一个不灭的神话,这不仅仅是一部电影,而是作为一个走进新时代的标签,永远彪炳史册。Would you rate the previous review as positive, neutral or negative? |
zh-zh sentiment | 一个传奇的开端,一个不灭的神话,这不仅仅是一部电影,而是作为一个走进新时代的标签,永远彪炳史册。你认为这句话的立场是赞扬、中立还是批评? |
vi-en query | Suggest at least five related search terms to "Mạng neural nhân tạo". |
fr-fr query | Proposez au moins cinq mots clés concernant «Réseau de neurones artificiels». |
te-en qa | Explain in a sentence in Telugu what is backpropagation in neural networks. |
en-en qa | Why is the sky blue? |
es-en fable | Write a fairy tale about a troll saving a princess from a dangerous dragon. The fairy tale is a masterpiece that has achieved praise worldwide and its moral is "Heroes Come in All Shapes and Sizes". Story (in Spanish): |
hi-en fable | Write a fable about wood elves living in a forest that is suddenly invaded by ogres. The fable is a masterpiece that has achieved praise worldwide and its moral is "Violence is the last refuge of the incompetent". Fable (in Hindi): |
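The two sentiment rows above share a fixed instruction appended to the review text; a small helper (hypothetical, not part of the model card) can assemble such prompts programmatically, using only the English and Chinese instruction strings shown in the widget examples:

```python
def sentiment_prompt(review: str, lang: str = "en") -> str:
    """Append the widget's sentiment instruction to a review.

    Only the two instruction variants from the widget examples are
    included; other languages would need their own templates.
    """
    instructions = {
        "en": "Would you rate the previous review as positive, neutral or negative?",
        "zh": "你认为这句话的立场是赞扬、中立还是批评?",
    }
    return review + instructions[lang]
```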
📊 Model Performance
Model Name: mt0-xl
The following table shows the model's performance on various tasks and datasets:
Task Type | Dataset Type | Dataset Name | Config | Split | Revision | Metric Type | Value |
---|---|---|---|---|---|---|---|
Coreference resolution | winogrande | Winogrande XL (xl) | xl | validation | a80f460359d1e9a67c006011c94de42a8759430c | Accuracy | 52.49 |
Coreference resolution | Muennighoff/xwinograd | XWinograd (en) | en | test | 9dd5ea5505fad86b7bedad667955577815300cee | Accuracy | 61.89 |
Coreference resolution | Muennighoff/xwinograd | XWinograd (fr) | fr | test | 9dd5ea5505fad86b7bedad667955577815300cee | Accuracy | 59.04 |
Coreference resolution | Muennighoff/xwinograd | XWinograd (jp) | jp | test | 9dd5ea5505fad86b7bedad667955577815300cee | Accuracy | 60.27 |
Coreference resolution | Muennighoff/xwinograd | XWinograd (pt) | pt | test | 9dd5ea5505fad86b7bedad667955577815300cee | Accuracy | 66.16 |
Coreference resolution | Muennighoff/xwinograd | XWinograd (ru) | ru | test | 9dd5ea5505fad86b7bedad667955577815300cee | Accuracy | 59.05 |
Coreference resolution | Muennighoff/xwinograd | XWinograd (zh) | zh | test | 9dd5ea5505fad86b7bedad667955577815300cee | Accuracy | 62.9 |
Natural language inference | anli | ANLI (r1) | r1 | validation | 9dbd830a06fea8b1c49d6e5ef2004a08d9f45094 | Accuracy | 38.2 |
Natural language inference | anli | ANLI (r2) | r2 | validation | 9dbd830a06fea8b1c49d6e5ef2004a08d9f45094 | Accuracy | 34.8 |
Natural language inference | anli | ANLI (r3) | r3 | validation | 9dbd830a06fea8b1c49d6e5ef2004a08d9f45094 | Accuracy | 39 |
Natural language inference | super_glue | SuperGLUE (cb) | cb | validation | 9e12063561e7e6c79099feb6d5a493142584e9e2 | Accuracy | 85.71 |
Natural language inference | super_glue | SuperGLUE (rte) | rte | validation | 9e12063561e7e6c79099feb6d5a493142584e9e2 | Accuracy | 78.7 |
Natural language inference | xnli | XNLI (ar) | ar | validation | a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 | Accuracy | 51.85 |
Natural language inference | xnli | XNLI (bg) | bg | validation | a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 | Accuracy | 54.18 |
Natural language inference | xnli | XNLI (de) | de | validation | a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 | Accuracy | 54.78 |
Natural language inference | xnli | XNLI (el) | el | validation | a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 | Accuracy | 53.78 |
Natural language inference | xnli | XNLI (en) | en | validation | a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 | Accuracy | 56.83 |
Natural language inference | xnli | XNLI (es) | es | validation | a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 | Accuracy | 54.78 |
Natural language inference | xnli | XNLI (fr) | fr | validation | a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 | Accuracy | 54.22 |
Natural language inference | xnli | XNLI (hi) | hi | validation | a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 | Accuracy | 50.24 |
Natural language inference | xnli | XNLI (ru) | ru | validation | a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 | Accuracy | 53.09 |
Natural language inference | xnli | XNLI (sw) | sw | validation | a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 | Accuracy | 49.6 |
Natural language inference | xnli | XNLI (th) | th | validation | a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 | Accuracy | 52.13 |
Natural language inference | xnli | XNLI (tr) | tr | validation | a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 | Accuracy | 50.56 |
Natural language inference | xnli | XNLI (ur) | ur | validation | a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 | Accuracy | 47.91 |
Natural language inference | xnli | XNLI (vi) | vi | validation | a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 | Accuracy | 53.21 |
Natural language inference | xnli | XNLI (zh) | zh | validation | a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16 | Accuracy | 50.64 |
Program synthesis | openai_humaneval | HumanEval | None | test | e8dc562f5de170c54b5481011dd9f4fa04845771 | Pass@1 | 0 |
Program synthesis | openai_humaneval | HumanEval | None | test | e8dc562f5de170c54b5481011dd9f4fa04845771 | Pass@10 | 0 |
Program synthesis | openai_humaneval | HumanEval | None | test | e8dc562f5de170c54b5481011dd9f4fa04845771 | Pass@100 | 0 |
Sentence completion | story_cloze | StoryCloze (2016) | 2016 | validation | e724c6f8cdf7c7a2fb229d862226e15b023ee4db | Accuracy | 79.1 |
Sentence completion | super_glue | SuperGLUE (copa) | copa | validation | 9e12063561e7e6c79099feb6d5a493142584e9e2 | Accuracy | 72 |
Sentence completion | xcopa | XCOPA (et) | et | validation | 37f73c60fb123111fa5af5f9b705d0b3747fd187 | Accuracy | 70 |
Sentence completion | xcopa | XCOPA (ht) | ht | validation | 37f73c60fb123111fa5af5f9b705d0b3747fd187 | Accuracy | 66 |
Sentence completion | xcopa | XCOPA (id) | id | validation | 37f73c60fb123111fa5af5f9b705d0b3747fd187 | Accuracy | 71 |
Sentence completion | xcopa | XCOPA (it) | it | validation | 37f73c60fb123111fa5af5f9b705d0b3747fd187 | Accuracy | 70 |
Sentence completion | xcopa | XCOPA (qu) | qu | validation | 37f73c60fb123111fa5af5f9b705d0b3747fd187 | Accuracy | 56 |
Sentence completion | xcopa | XCOPA (sw) | sw | validation | 37f73c60fb123111fa5af5f9b705d0b3747fd187 | Accuracy | 53 |
Sentence completion | xcopa | XCOPA (ta) | ta | validation | 37f73c60fb123111fa5af5f9b705d0b3747fd187 | Accuracy | 64 |
Sentence completion | xcopa | XCOPA (th) | th | validation | 37f73c60fb123111fa5af5f9b705d0b3747fd187 | Accuracy | 60 |
Sentence completion | xcopa | XCOPA (tr) | tr | validation | 37f73c60fb123111fa5af5f9b705d0b3747fd187 | Accuracy | 58 |
Sentence completion | xcopa | XCOPA (vi) | vi | validation | 37f73c60fb123111fa5af5f9b705d0b3747fd187 | Accuracy | 68 |
Sentence completion | xcopa | XCOPA (zh) | zh | validation | 37f73c60fb123111fa5af5f9b705d0b3747fd187 | Accuracy | 65 |
Sentence completion | Muennighoff/xstory_cloze | XStoryCloze (ar) | ar | validation | 8bb76e594b68147f1a430e86829d07189622b90d | Accuracy | 70.09 |
Sentence completion | Muennighoff/xstory_cloze | XStoryCloze (es) | es | validation | 8bb76e594b68147f1a430e86829d07189622b90d | Accuracy | 77.17 |