DeBERTa XLarge
DeBERTa improves on BERT and RoBERTa with a disentangled attention mechanism and an enhanced mask decoder, achieving superior performance on most natural language understanding tasks.
Downloads 312
Release Time: 3/2/2022
Model Overview
DeBERTa is an enhanced BERT-style model that improves performance on natural language understanding tasks through a disentangled attention mechanism and an enhanced mask decoder.
Model Features
Disentangled Attention Mechanism
Represents each token with separate content and position vectors and computes attention from disentangled content-to-content, content-to-position, and position-to-content terms, improving the model's understanding of text.
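A minimal single-head sketch of how these disentangled scores can be computed, assuming a simple clipped relative-distance bucketing; `disentangled_scores`, `rel_pos_bucket`, and the `W*` projection names are illustrative and do not correspond to the library's internals:

```python
import math
import torch

def rel_pos_bucket(n, k):
    # delta(i, j): relative distance i - j, clipped and shifted into [0, 2k - 1]
    pos = torch.arange(n)
    dist = pos[:, None] - pos[None, :]          # (n, n)
    return dist.clamp(-k, k - 1) + k            # indices into 2k position buckets

def disentangled_scores(H, P, Wq, Wk, Wqr, Wkr, k):
    # H: (n, d) content states; P: (2k, d) relative-position embeddings.
    n, d = H.shape
    idx = rel_pos_bucket(n, k)                  # (n, n) relative-position indices
    Qc, Kc = H @ Wq, H @ Wk                     # content queries / keys
    Qr, Kr = P @ Wqr, P @ Wkr                   # position queries / keys
    c2c = Qc @ Kc.T                             # content-to-content term
    c2p = (Qc @ Kr.T).gather(1, idx)            # content-to-position term
    p2c = (Kc @ Qr.T).gather(1, idx).T          # position-to-content term
    return (c2c + c2p + p2c) / math.sqrt(3 * d) # scaled by sqrt(3d) as in the paper

n, d, k = 8, 16, 4
H, P = torch.randn(n, d), torch.randn(2 * k, d)
Wq, Wk, Wqr, Wkr = (torch.randn(d, d) / d ** 0.5 for _ in range(4))
attn = torch.softmax(disentangled_scores(H, P, Wq, Wk, Wqr, Wkr, k), dim=-1)
print(attn.shape)  # torch.Size([8, 8])
```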
Enhanced Mask Decoder
Incorporates absolute position information right before the softmax layer that predicts masked tokens, improving performance on masked language modeling.
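A toy illustration of that idea only: the actual enhanced mask decoder runs additional decoding layers in which absolute position embeddings participate, whereas this sketch merely adds them to the final hidden states before the vocabulary projection, to show that absolute positions enter at the decoding step rather than at the input:

```python
import torch
import torch.nn as nn

d, vocab, n = 64, 1000, 16
vocab_proj = nn.Linear(d, vocab)      # token-prediction head (illustrative)
abs_pos_emb = nn.Embedding(512, d)    # absolute position embeddings

hidden = torch.randn(n, d)            # encoder output built from relative positions only
positions = torch.arange(n)
logits = vocab_proj(hidden + abs_pos_emb(positions))  # (n, vocab) MLM logits
```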
Large-scale Pretraining
Pretrained on 80GB of text data, outperforming RoBERTa on a range of natural language understanding tasks.
Model Capabilities
Text understanding (see the usage sketch after this list)
Masked language modeling
Natural language inference
Question answering
Text classification
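A minimal sketch of loading the model to encode text, assuming the checkpoint is published on the Hugging Face Hub as "microsoft/deberta-xlarge" (verify the exact id before use):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("microsoft/deberta-xlarge")
model = AutoModel.from_pretrained("microsoft/deberta-xlarge")

inputs = tok("DeBERTa disentangles content and position.", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)
print(out.last_hidden_state.shape)  # (1, seq_len, hidden_size)
```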
Use Cases
Natural Language Understanding
Question Answering
Excels on extractive QA benchmarks such as SQuAD 1.1 and 2.0.
Achieves F1/EM scores of 95.5/90.1 on SQuAD 1.1.
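A hypothetical usage sketch: it assumes a SQuAD-fine-tuned DeBERTa checkpoint, where "your-org/deberta-xlarge-squad" is a placeholder id, not a published model:

```python
from transformers import pipeline

# Placeholder checkpoint id; substitute a real SQuAD-fine-tuned model.
qa = pipeline("question-answering", model="your-org/deberta-xlarge-squad")
print(qa(question="What does DeBERTa improve?",
         context="DeBERTa improves BERT with disentangled attention."))
```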
Text Classification
Delivers strong results on text classification tasks from the GLUE benchmark.
Achieves 97.0% accuracy on SST-2 sentiment classification.
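A sketch of setting the model up for binary sentiment classification; the classification head is randomly initialized, so reaching the quoted SST-2 accuracy requires fine-tuning on labeled data:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tok = AutoTokenizer.from_pretrained("microsoft/deberta-xlarge")
model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/deberta-xlarge", num_labels=2)  # new head, needs fine-tuning
# model is now ready for fine-tuning on a labeled dataset such as SST-2.
```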
Natural Language Inference
Excels at NLI tasks such as MNLI.
Achieves 91.5/91.2 accuracy on MNLI-m/mm.
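An inference sketch, assuming the MNLI-fine-tuned checkpoint is available on the Hub as "microsoft/deberta-xlarge-mnli" (verify the exact id):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "microsoft/deberta-xlarge-mnli"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

# Premise / hypothesis pair for entailment classification.
inputs = tok("A man is playing guitar.", "A person is making music.",
             return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(model.config.id2label[logits.argmax(-1).item()])
```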