V3large 2epoch
DeBERTa is an improved BERT-style model built on a disentangled attention mechanism. Pretrained on 160GB of text with up to 1.5 billion parameters, it surpasses BERT and RoBERTa on multiple natural language understanding tasks.
Release Time: 3/2/2022
Model Overview
DeBERTa improves on the BERT architecture with disentangled attention and an enhanced mask decoder, making it particularly well suited to natural language understanding tasks; it achieves excellent performance on the GLUE benchmark.
Model Features
Disentangled Attention Mechanism
Separates the attention computation over content and relative position representations, improving the model's ability to capture relationships between tokens (see the score decomposition after this feature list).
Enhanced Mask Decoder
An improved masked language modeling objective that incorporates absolute position information into the decoding layer, strengthening the model's contextual modeling.
Large-scale Pretraining
Pretrained on 160GB of raw text, with model sizes of up to 1.5 billion parameters.
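For reference, the disentangled attention score from the DeBERTa paper decomposes as follows, where H_i is the content vector of token i and P_{i|j} its relative-position vector with respect to token j (the position-to-position term is dropped):

A_{i,j} = H_i H_j^T + H_i P_{j|i}^T + P_{i|j} H_j^T

That is, the score is the sum of content-to-content, content-to-position, and position-to-content terms, each computed with its own projections.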
Model Capabilities
Text Classification
Natural Language Inference
Question Answering
Semantic Similarity
Sentence Pair Classification
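All of these capabilities can be driven through the Hugging Face transformers library. A minimal loading sketch, assuming the checkpoint is published under a Hub id such as microsoft/deberta-v3-large (the exact id of this 2-epoch build is not stated here):

from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Model id is an assumption; substitute the actual id of this checkpoint.
model_id = "microsoft/deberta-v3-large"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer("DeBERTa disentangles content and position attention.", return_tensors="pt")
logits = model(**inputs).logits  # the classification head is randomly initialized until fine-tuned
print(logits.shape)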
Use Cases
Text Understanding
Multi-genre Natural Language Inference
Determine the logical relationship between a premise and a hypothesis (entailment/contradiction/neutral).
Achieves 91.7/91.9 (matched/mismatched) accuracy on the MNLI dataset.
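A hedged sketch of the inference call, assuming the publicly available MNLI fine-tune microsoft/deberta-large-mnli (not necessarily this exact checkpoint):

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "microsoft/deberta-large-mnli"  # assumed MNLI fine-tune, not this exact checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Premise and hypothesis are encoded together as a sentence pair.
enc = tokenizer("A soccer game with multiple males playing.",
                "Some men are playing a sport.",
                return_tensors="pt")
with torch.no_grad():
    logits = model(**enc).logits
print(model.config.id2label[int(logits.argmax())])  # expected: ENTAILMENT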
Sentiment Analysis
Analyze text sentiment (positive/negative).
Achieves 97.2% accuracy on the SST-2 dataset.
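A minimal sketch with the transformers pipeline; the model id below is a placeholder for a DeBERTa checkpoint fine-tuned on SST-2:

from transformers import pipeline

# Placeholder id: point this at a DeBERTa model fine-tuned on SST-2.
classifier = pipeline("text-classification", model="your-org/deberta-v3-large-sst2")
print(classifier("The movie was an absolute delight."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99}]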
Question Answering
Machine Reading Comprehension
Answer questions based on a given passage.
Achieves 92.2 F1 / 89.7 EM on SQuAD 2.0.
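A hedged sketch using the question-answering pipeline, assuming a SQuAD 2.0 fine-tune such as the community checkpoint deepset/deberta-v3-large-squad2:

from transformers import pipeline

# Assumed community checkpoint fine-tuned on SQuAD 2.0.
qa = pipeline("question-answering", model="deepset/deberta-v3-large-squad2")
result = qa(
    question="What mechanism does DeBERTa use?",
    context="DeBERTa improves BERT with a disentangled attention mechanism "
            "that separates content and position representations.",
)
print(result["answer"])  # e.g. "a disentangled attention mechanism"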