DeBERTa V2 XLarge
DeBERTa (Decoding-enhanced BERT with disentangled Attention) improves on BERT with a disentangled attention mechanism and an enhanced mask decoder, surpassing both BERT and RoBERTa on a range of natural language understanding tasks.
Release Date: 3/2/2022
Model Overview
DeBERTa is an improved BERT model that boosts performance on natural language understanding tasks through a disentangled attention mechanism and an enhanced mask decoder. This V2 XLarge variant was pretrained on 160GB of data and uses a 24-layer architecture with a hidden size of 1536, totaling roughly 900 million parameters.
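These architecture details can be confirmed directly from the published configuration. A quick check using the Hugging Face transformers library, assuming the microsoft/deberta-v2-xlarge checkpoint:

```python
from transformers import AutoConfig

# Load the published configuration for the checkpoint
config = AutoConfig.from_pretrained("microsoft/deberta-v2-xlarge")
print(config.num_hidden_layers)  # 24
print(config.hidden_size)        # 1536
```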
Model Features
Disentangled Attention Mechanism
Captures dependencies in text by computing attention from separate content and relative-position representations rather than a single combined embedding.
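Concretely, each attention score is the sum of content-to-content, content-to-position, and position-to-content terms. Below is a minimal single-head sketch of the idea in PyTorch; the per-pair relative-position tensor P_rel and the weight matrices are illustrative placeholders, and the actual DeBERTa implementation is more memory-efficient (it indexes a shared table of relative-position embeddings instead of materializing an (n, n, d) tensor):

```python
import math
import torch

def disentangled_attention_scores(H, P_rel, W_q, W_k, W_qr, W_kr):
    """Single-head disentangled attention scores (simplified sketch).

    H:     (n, d) content hidden states
    P_rel: (n, n, d) relative-position embedding for each (i, j) pair
    """
    Qc = H @ W_q                    # content queries, (n, d)
    Kc = H @ W_k                    # content keys,    (n, d)
    c2c = Qc @ Kc.T                 # content-to-content scores, (n, n)
    Kr = P_rel @ W_kr               # relative-position keys, (n, n, d)
    # content-to-position: query content attends to the key's relative position
    c2p = torch.einsum("id,ijd->ij", Qc, Kr)
    Qr = P_rel @ W_qr               # relative-position queries, (n, n, d)
    # position-to-content: key content attends to the reversed relative position
    p2c = torch.einsum("jd,ijd->ij", Kc, Qr.transpose(0, 1))
    # scale by sqrt(3d) since the score is a sum of three dot products
    return (c2c + c2p + p2c) / math.sqrt(3 * H.size(-1))

n, d = 6, 16
H = torch.randn(n, d)
P_rel = torch.randn(n, n, d)
W_q, W_k, W_qr, W_kr = (torch.randn(d, d) for _ in range(4))
attn = disentangled_attention_scores(H, P_rel, W_q, W_k, W_qr, W_kr).softmax(-1)
```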
Enhanced Masked Decoder
Incorporates absolute position information in the decoding layer when predicting masked tokens, improving on standard masked language modeling and strengthening contextual understanding.
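Because the checkpoint was pretrained with masked language modeling, it can fill in masked tokens without any fine-tuning. A minimal sketch with the transformers library, assuming the microsoft/deberta-v2-xlarge checkpoint (the example sentence is illustrative):

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v2-xlarge")
model = AutoModelForMaskedLM.from_pretrained("microsoft/deberta-v2-xlarge")

inputs = tokenizer("Paris is the [MASK] of France.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Locate the masked position and take its top predictions
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top_ids = logits[0, mask_pos].topk(5).indices[0].tolist()
print(tokenizer.convert_ids_to_tokens(top_ids))
```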
Large-scale Pretraining
Trained on 160GB of raw data, providing robust language representation capabilities.
Model Capabilities
Text Understanding
Question Answering Systems
Text Classification
Natural Language Inference
Semantic Similarity Calculation
Use Cases
Natural Language Processing
Question Answering System
Build high-performance question answering systems, e.g., for SQuAD-style extractive QA.
Achieved F1/EM scores of 91.4/89.7 on SQuAD 2.0.
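A sketch of serving such a system with the transformers pipeline API; the model name below is a hypothetical placeholder for a DeBERTa V2 XLarge checkpoint that has actually been fine-tuned on SQuAD 2.0 (the base checkpoint ships without a trained QA head):

```python
from transformers import pipeline

# "your-org/deberta-v2-xlarge-squad2" is a hypothetical placeholder;
# substitute a checkpoint fine-tuned on SQuAD 2.0
qa = pipeline("question-answering", model="your-org/deberta-v2-xlarge-squad2")
result = qa(
    question="What mechanism does DeBERTa use?",
    context="DeBERTa improves on BERT using a disentangled attention "
            "mechanism and an enhanced mask decoder.",
)
print(result["answer"], result["score"])
```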
Text Classification
Used for text classification tasks like sentiment analysis.
Achieved 97.5% accuracy on the SST-2 sentiment analysis task.
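A similar pipeline sketch for sentiment classification; again the model name is a hypothetical placeholder for a checkpoint fine-tuned on SST-2:

```python
from transformers import pipeline

# hypothetical placeholder; substitute a checkpoint fine-tuned on SST-2
classifier = pipeline("text-classification", model="your-org/deberta-v2-xlarge-sst2")
print(classifier("A touching and beautifully made film."))
```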
Natural Language Inference
Determines the logical relationship between two pieces of text.
Achieved 91.7/91.9 accuracy on the MNLI matched/mismatched sets.
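For NLI-style fine-tuning, premise and hypothesis are encoded as a sentence pair. A sketch of preparing the model for MNLI-style training; the three-way classification head below is randomly initialized and must be fine-tuned before its outputs are meaningful:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v2-xlarge")
# num_labels=3 for entailment / neutral / contradiction; this head is
# untrained until fine-tuned on MNLI
model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/deberta-v2-xlarge", num_labels=3
)
inputs = tokenizer(
    "A man is playing a guitar on stage.",   # premise
    "Someone is performing music.",          # hypothesis
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits          # shape (1, 3)
```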