# DeBERTa-v3-base-tasksource-nli Model Card

This is a model card for DeBERTa-v3-base-tasksource-nli, a version of DeBERTa-v3-base fine-tuned with multi-task learning on 600+ tasks from the tasksource collection. The model shows strong zero-shot validation performance on many tasks and can be used in several ways.
## 🚀 Quick Start

### Zero-shot Classification

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="sileod/deberta-v3-base-tasksource-nli")

text = "one day I will see the world"
candidate_labels = ['travel', 'cooking', 'dancing']
classifier(text, candidate_labels)
```
### Natural Language Inference

```python
from transformers import pipeline

pipe = pipeline("text-classification", model="sileod/deberta-v3-base-tasksource-nli")
pipe([dict(text='there is a cat',
           text_pair='there is a black cat')])
```
### Tasksource-adapters

```python
import tasknet as tn

pipe = tn.load_pipeline('sileod/deberta-v3-base-tasksource-nli', 'glue/sst2')
pipe(['That movie was great !', 'Awful movie.'])
```
### Further Fine-tuning

```python
import tasknet as tn

hparams = dict(model_name='sileod/deberta-v3-base-tasksource-nli', learning_rate=2e-5)
model, trainer = tn.Model_Trainer([tn.AutoTask("glue/rte")], hparams)
trainer.train()
```
## ✨ Features

- Zero-shot entailment-based classification: performs zero-shot classification for arbitrary labels [ZS].
- Natural language inference: handles natural language inference tasks [NLI].
- Access to hundreds of tasks: with tasksource-adapters, provides one-line access to hundreds of tasks [TA].
- Further fine-tuning: supports further fine-tuning on new tasks or tasksource tasks (classification, token classification, or multiple-choice) [FT].
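The zero-shot mode works by recasting each candidate label as an NLI hypothesis and ranking labels by their entailment scores against the input text. A minimal sketch of that recasting step (the template shown matches the transformers pipeline default; the helper name is illustrative):

```python
def build_hypotheses(labels, template="This example is {}."):
    # NLI-based zero-shot classification: each candidate label becomes a
    # hypothesis, and the premise/hypothesis entailment score ranks the labels
    return [template.format(label) for label in labels]

print(build_hypotheses(['travel', 'cooking', 'dancing']))
# → ['This example is travel.', 'This example is cooking.', 'This example is dancing.']
```

The pipeline's `hypothesis_template` argument lets you swap in a domain-specific template when the default phrasing fits your labels poorly.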
## 📦 Installation

The zero-shot classification and natural language inference pipelines require the transformers library; tasksource-adapters and further fine-tuning require tasknet.

```bash
pip install transformers
pip install tasknet
```
## 📚 Documentation

### Model Details

The model is based on DeBERTa-v3-base and fine-tuned with multi-task learning on 600+ tasks from the tasksource collection.
### Evaluation

This model ranked 1st among all models with the microsoft/deberta-v3-base architecture in the IBM model recycling evaluation. More details: https://ibm.github.io/model-recycling/.
### Software and Training Details

- Training tasks: 600 tasks.
- Training steps: 200k.
- Batch size: 384.
- Peak learning rate: 2e-5.
- Training time: 15 days on an Nvidia A30 24GB GPU.
- Model architecture: a single shared model with the MNLI classifier on top. Each task has its own CLS embedding, which is dropped 10% of the time during training so the model can also be used without it. All multiple-choice tasks use the same classification layers, and classification tasks share classifier weights whenever their label sets match.
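The task-specific CLS scheme described above can be sketched as follows. This is a toy illustration with hypothetical names and stand-in vectors, not the actual implementation:

```python
import random

def pick_cls_embedding(task_embeddings, shared_cls, task, p_drop=0.1, rng=random):
    """Return the task-specific CLS embedding, falling back to a shared one
    p_drop of the time so the model also learns to work without task hints.
    (Hypothetical helper illustrating the training scheme described above.)"""
    if rng.random() < p_drop:
        return shared_cls
    return task_embeddings[task]

# toy vectors standing in for learned embeddings
task_embeddings = {"glue/rte": [1.0, 0.0], "anli": [0.0, 1.0]}
shared_cls = [0.5, 0.5]

rng = random.Random(0)
picks = [pick_cls_embedding(task_embeddings, shared_cls, "glue/rte", rng=rng)
         for _ in range(10_000)]
drop_rate = sum(p is shared_cls for p in picks) / len(picks)
print(drop_rate)  # close to 0.1
```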
### Datasets

The model was trained on a large number of datasets, including glue, nyu-mll/multi_nli, and super_glue, among others.
### Metrics

The main evaluation metric is accuracy.
## 🔧 Technical Details

The model uses multi-task learning on a large number of tasks from the tasksource collection. Implementation details are available in the training code: https://colab.research.google.com/drive/1iB4Oxl9_B5W3ZDzXoWJN-olUbqLBxgQS?usp=sharing
## 📄 License

This model is licensed under the Apache-2.0 license.
## 📋 Model Index

| Task | Dataset | Split | Metric | Value |
|------|---------|-------|--------|-------|
| Text Classification | glue (config: rte) | validation | accuracy | 0.89 |
| Natural Language Inference | anli-r3 (config: plain_text) | validation | accuracy | 0.52 |
## ⚠️ Important Note

Deprecated: use https://huggingface.co/tasksource/deberta-small-long-nli for longer context and better accuracy.
## 💡 Usage Tip

The list of tasks available through tasksource-adapters is in the model's config.json. Tasksource-adapters are more efficient than zero-shot classification because they require only one forward pass per example, but they are less flexible.
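The efficiency difference is easy to quantify: NLI-based zero-shot classification needs one entailment forward pass per (example, label) pair, while a tasksource adapter classifies each example in a single pass. A back-of-the-envelope comparison:

```python
n_examples, n_labels = 1000, 5

# zero-shot: every example is paired with every candidate label
zero_shot_passes = n_examples * n_labels

# tasksource-adapter: one classification pass per example
adapter_passes = n_examples

print(zero_shot_passes, adapter_passes)  # → 5000 1000
```

The gap grows linearly with the number of candidate labels, so adapters pay off most when the label set is large and fixed.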
## 📖 Citation

More details in this article:

```bibtex
@article{sileo2023tasksource,
  title={tasksource: Structured Dataset Preprocessing Annotations for Frictionless Extreme Multi-Task Learning and Evaluation},
  author={Sileo, Damien},
  journal={arXiv preprint arXiv:2301.05948},
  url={https://arxiv.org/abs/2301.05948},
  year={2023}
}
```
## 📞 Model Card Contact

If you have any questions, you can contact damien.sileo@inria.fr.