
DeBERTa-v3 Small

Developed by Microsoft
DeBERTa-v3 is an improved natural language understanding model developed by Microsoft. It combines ELECTRA-style pretraining with gradient-disentangled embedding sharing to deliver strong performance at a relatively small parameter count.
Downloads: 189.23k
Release date: 3/2/2022

Model Overview

The DeBERTa-v3 Small model uses a 6-layer encoder aimed at natural language understanding tasks, improving efficiency through a disentangled attention mechanism and an enhanced mask decoder.
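To make the disentangled attention idea concrete, here is a minimal pure-Python sketch (not the actual DeBERTa implementation) of how an attention logit is assembled from three terms: content-to-content, content-to-relative-position, and relative-position-to-content, scaled by the square root of 3d as described in the DeBERTa papers. All function and variable names here are illustrative.

```python
import math

def dot(a, b):
    """Plain dot product over Python lists."""
    return sum(x * y for x, y in zip(a, b))

def rel_pos(i, j, k):
    """Clamp the relative distance j - i into the bucket range [0, 2k)."""
    d = j - i
    if d <= -k:
        return 0
    if d >= k:
        return 2 * k - 1
    return d + k

def disentangled_scores(Qc, Kc, Qr, Kr, k):
    """Toy attention logits: content-to-content + content-to-position
    + position-to-content, scaled by sqrt(3 * d).

    Qc, Kc: per-token content query/key vectors (n x d).
    Qr, Kr: relative-position query/key embeddings (2k x d).
    """
    n = len(Qc)
    d = len(Qc[0])
    scale = math.sqrt(3 * d)
    scores = []
    for i in range(n):
        row = []
        for j in range(n):
            c2c = dot(Qc[i], Kc[j])                    # content-to-content
            c2p = dot(Qc[i], Kr[rel_pos(i, j, k)])     # content-to-position
            p2c = dot(Kc[j], Qr[rel_pos(j, i, k)])     # position-to-content
            row.append((c2c + c2p + p2c) / scale)
        scores.append(row)
    return scores
```

Because position information enters through separate relative embeddings rather than being added into the token vectors, the model can weight content similarity and positional proximity independently.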

Model Features

ELECTRA-style Pretraining
Uses the ELECTRA replaced-token-detection framework, which trains on every token position and makes pretraining more sample-efficient than standard masked language modeling
Gradient-disentangled Embedding Sharing
Shares embedding parameters between the generator and discriminator while decoupling their gradients, which stabilizes ELECTRA-style training
Disentangled Attention Mechanism
Encodes content and relative position separately, so the attention weights capture both kinds of information in text
Enhanced Mask Decoder
Strengthens masked language modeling, improving the model's understanding performance
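The ELECTRA-style objective above can be sketched with a toy corruption step. This is a minimal pure-Python illustration, not DeBERTa-v3's training code: the real setup uses a small MLM generator network, whereas here the "generator" is just uniform random sampling. The point it shows is that the discriminator receives a replaced/original label at every position, a denser training signal than MLM's ~15% of masked positions.

```python
import random

def electra_corrupt(tokens, vocab, replace_rate, seed=0):
    """Toy ELECTRA-style corruption.

    Replaces a fraction of tokens with random vocabulary items (standing
    in for a small generator's samples) and emits the per-position
    replaced(1)/original(0) labels the discriminator would be trained on.
    """
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < replace_rate:
            # Sample a different token to act as the replacement.
            new = rng.choice([v for v in vocab if v != tok])
            corrupted.append(new)
            labels.append(1)  # replaced
        else:
            corrupted.append(tok)
            labels.append(0)  # original
    return corrupted, labels
```

Note that every position gets a label, so every position contributes to the discriminator's loss.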

Model Capabilities

Text Classification
Question Answering Systems
Natural Language Inference

Use Cases

Text Understanding
Question Answering
Applied to QA datasets such as SQuAD; F1 score of 82.8 on SQuAD 2.0
Text Classification
Applied to natural language inference tasks such as MNLI; matched/mismatched accuracy of 88.3/87.7