
DeBERTa Large

Developed by Microsoft
DeBERTa improves on BERT with a disentangled attention mechanism and an enhanced mask decoder, outperforming both BERT and RoBERTa on multiple natural language understanding tasks.
Downloads: 15.07k
Release Date: 3/2/2022

Model Overview

DeBERTa (Decoding-enhanced BERT with disentangled attention) improves on the BERT architecture with a disentangled attention mechanism and an enhanced mask decoder, excelling particularly at natural language understanding tasks.
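For orientation, here is a minimal usage sketch, assuming the Hugging Face transformers and PyTorch packages are installed; it loads the checkpoint and extracts contextual embeddings through the standard AutoModel API.

```python
# Minimal sketch: encode a sentence with microsoft/deberta-large and
# inspect the contextual embeddings. Assumes: pip install torch transformers
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-large")
model = AutoModel.from_pretrained("microsoft/deberta-large")

inputs = tokenizer(
    "DeBERTa improves on BERT with disentangled attention.",
    return_tensors="pt",
)
with torch.no_grad():
    outputs = model(**inputs)

# last_hidden_state: (batch, sequence_length, hidden_size);
# hidden_size is 1024 for the large checkpoint.
print(outputs.last_hidden_state.shape)
```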

Model Features

Disentangled Attention Mechanism
Decouples each token's content and position information in the attention computation, strengthening the model's grasp of both semantic and positional relationships (a toy sketch of the resulting attention score follows this feature list).
Enhanced Mask Decoder
Augments the masked language modeling objective by injecting absolute position information just before the masked tokens are predicted, better capturing their contextual dependencies.
Large-scale Pre-training
Pre-trained on roughly 80GB of text data, yielding richer language representations.
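
To make the disentangled attention idea concrete, here is a toy PyTorch sketch of the three score terms described in the DeBERTa paper (content-to-content, content-to-position, position-to-content), scaled by 1/sqrt(3d). It is an illustration only: the real implementation adds relative-distance bucketing, multiple heads, shared projections, and masking.

```python
# Toy sketch of DeBERTa's disentangled attention score (single head,
# no bucketing). Not the official implementation.
import math
import torch

def disentangled_scores(q_c, k_c, q_r, k_r, rel_idx):
    """q_c, k_c: (seq, d) content projections of the tokens.
    q_r, k_r: (2k, d) projections of relative-position embeddings.
    rel_idx:  (seq, seq) clipped relative distance delta(i, j) in [0, 2k).
    """
    c2c = q_c @ k_c.T                                # content-to-content
    c2p = torch.gather(q_c @ k_r.T, 1, rel_idx)      # content-to-position
    p2c = torch.gather(k_c @ q_r.T, 1, rel_idx).T    # position-to-content
    return (c2c + c2p + p2c) / math.sqrt(3 * q_c.size(-1))

seq, d, k = 6, 8, 4
pos = torch.arange(seq)
rel = torch.clamp(pos[:, None] - pos[None, :], -k, k - 1) + k  # -> [0, 2k)
scores = disentangled_scores(
    torch.randn(seq, d), torch.randn(seq, d),
    torch.randn(2 * k, d), torch.randn(2 * k, d), rel,
)
attn = scores.softmax(dim=-1)  # (seq, seq) attention weights
print(attn.shape)
```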

Model Capabilities

Text Classification
Question Answering Systems
Natural Language Inference
Semantic Similarity Calculation
Linguistic Acceptability Judgment
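
As a sketch of one of these capabilities in practice, natural language inference, the example below uses the publicly released MNLI fine-tune of this checkpoint, microsoft/deberta-large-mnli; the base model needs task-specific fine-tuning before it can perform any of these tasks.

```python
# NLI sketch: classify a premise/hypothesis pair with the MNLI
# fine-tune of DeBERTa-large. Assumes: pip install torch transformers
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "microsoft/deberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

premise = "A soccer game with multiple males playing."
hypothesis = "Some men are playing a sport."
inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# One probability per NLI label (contradiction / neutral / entailment).
probs = logits.softmax(dim=-1)[0]
for label_id, p in enumerate(probs.tolist()):
    print(model.config.id2label[label_id], round(p, 3))
```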

Use Cases

Academic Research
GLUE Benchmark Testing
Achieved state-of-the-art results on the General Language Understanding Evaluation (GLUE) benchmark at the time of its release.
Surpasses BERT and RoBERTa on tasks such as MNLI, SST-2, and QNLI.
Industrial Applications
Intelligent Customer Service
Used for question-answering systems and intent recognition.
Achieves an F1 score of 92.2 on the SQuAD 2.0 question-answering task.
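
A sketch of how such a question-answering service might be wired up with the transformers pipeline API; the checkpoint name below is a hypothetical placeholder, since microsoft/deberta-large ships without a question-answering head and must first be fine-tuned on SQuAD 2.0.

```python
# Extractive QA sketch. "your-org/deberta-large-squad2" is a hypothetical
# placeholder; substitute a DeBERTa checkpoint actually fine-tuned on
# SQuAD 2.0.
from transformers import pipeline

qa = pipeline("question-answering", model="your-org/deberta-large-squad2")

result = qa(
    question="What mechanism does DeBERTa add to BERT?",
    context=(
        "DeBERTa improves BERT with a disentangled attention mechanism "
        "and an enhanced mask decoder."
    ),
)
print(result)  # e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': ...}
```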