DeBERTa Base
DeBERTa is an improved BERT model built on a disentangled attention mechanism and an enhanced mask decoder, and it excels across a range of natural language understanding tasks.
Downloads 298.78k
Release Time: 3/2/2022
Model Overview
DeBERTa enhances the BERT architecture with an innovative disentangled attention mechanism and an enhanced mask decoder; trained on 80GB of data, it outperforms both BERT and RoBERTa on most natural language understanding benchmarks.
Model Features
Disentangled Attention Mechanism
Improves the expressive power of attention by representing each token's content and relative position as separate vectors and scoring attention from both (see the sketch after this list)
Enhanced Mask Decoder
Incorporates absolute position information when predicting masked tokens, capturing contextual dependencies more accurately
Efficient Pretraining
Surpasses RoBERTa's performance while training on just 80GB of data
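To make the disentangled attention idea concrete, here is a minimal illustrative sketch of the three score terms described in the DeBERTa paper: content-to-content, content-to-position, and position-to-content. All tensor names, shapes, and the bucketing helper below are made up for illustration; this is a sketch of the formula, not the library's internal implementation.

```python
# Illustrative sketch of DeBERTa-style disentangled attention scores.
# Names and sizes are hypothetical; see the DeBERTa paper for the full method.
import torch

d, n, k = 64, 8, 4                    # hidden size, sequence length, max relative distance
H = torch.randn(n, d)                 # per-token content vectors
P = torch.randn(2 * k, d)             # shared relative-position embeddings

Wq_c, Wk_c = torch.randn(d, d), torch.randn(d, d)  # content projections
Wq_r, Wk_r = torch.randn(d, d), torch.randn(d, d)  # position projections

def delta(i: int, j: int) -> int:
    """Bucket the relative distance i - j into the index range [0, 2k)."""
    if i - j <= -k:
        return 0
    if i - j >= k:
        return 2 * k - 1
    return i - j + k

Qc, Kc = H @ Wq_c, H @ Wk_c           # content queries / keys
Qr, Kr = P @ Wq_r, P @ Wk_r           # relative-position queries / keys

A = torch.zeros(n, n)
for i in range(n):
    for j in range(n):
        c2c = Qc[i] @ Kc[j]             # content-to-content term
        c2p = Qc[i] @ Kr[delta(i, j)]   # content-to-position term
        p2c = Kc[j] @ Qr[delta(j, i)]   # position-to-content term
        A[i, j] = (c2c + c2p + p2c) / (3 * d) ** 0.5  # scaled sum of all three

weights = torch.softmax(A, dim=-1)    # attention weights over positions
```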
Model Capabilities
Masked text prediction (demonstrated in the fill-mask snippet below)
Natural language understanding
Contextual representation learning
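As a quick check of the masked-prediction capability, the snippet below uses the Hugging Face transformers fill-mask pipeline with the public microsoft/deberta-base checkpoint, assuming the checkpoint ships with its masked-LM head; the example sentence is arbitrary.

```python
# Fill-mask sketch with the public microsoft/deberta-base checkpoint
# (assumes the masked-LM head is available in the checkpoint).
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="microsoft/deberta-base")
mask = fill_mask.tokenizer.mask_token          # use the model's own mask token
for pred in fill_mask(f"Paris is the {mask} of France."):
    print(f"{pred['token_str']!r}: {pred['score']:.3f}")
```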
Use Cases
Question Answering Systems
SQuAD QA Task
Used for machine reading comprehension tasks; a hedged usage sketch follows below
Achieves 93.1/87.2 (F1/EM) on SQuAD 1.1 after fine-tuning
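The base checkpoint must be fine-tuned on SQuAD before it can answer questions. As a sketch, the snippet below runs the transformers question-answering pipeline; the checkpoint id your-org/deberta-base-squad is hypothetical, standing in for any DeBERTa-base model fine-tuned on SQuAD.

```python
# Extractive QA sketch; the checkpoint id is a hypothetical
# DeBERTa-base model fine-tuned on SQuAD, not a published artifact.
from transformers import pipeline

qa = pipeline("question-answering", model="your-org/deberta-base-squad")
result = qa(
    question="What mechanism does DeBERTa introduce?",
    context=(
        "DeBERTa improves BERT with a disentangled attention mechanism "
        "and an enhanced mask decoder."
    ),
)
print(result["answer"], f"(score: {result['score']:.3f})")
```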
Text Classification
MNLI Inference Task
Used for natural language inference tasks; a usage sketch with the MNLI fine-tuned variant follows below
Achieves 88.8% accuracy on MNLI-m after fine-tuning
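For inference tasks, an MNLI fine-tuned variant is published on the Hub as microsoft/deberta-base-mnli. The sketch below scores a premise/hypothesis pair with it; the example sentences are arbitrary.

```python
# NLI sketch using the MNLI fine-tuned variant, microsoft/deberta-base-mnli.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "microsoft/deberta-base-mnli"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

premise = "A soccer game with multiple males playing."
hypothesis = "Some men are playing a sport."
inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
label = model.config.id2label[logits.argmax(-1).item()]
print(label)  # an entailment-style label is expected for this pair
```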