Erlangshen DeBERTa V2 320M Chinese
A Chinese pre-trained language model based on the DeBERTa-v2 architecture with 320 million parameters, excelling at natural language understanding tasks
Release Time: 9/21/2022
Model Overview
A Chinese DeBERTa-v2 Large model that employs the Whole Word Masking technique and is pre-trained on the WuDao corpus, making it suitable for a wide range of Chinese NLP tasks
Model Features
Whole Word Masking
Employs a whole word masking strategy during pre-training to strengthen the model's understanding of Chinese words (a minimal sketch follows this list)
Disentangled Attention Mechanism
Built on the DeBERTa-v2 architecture, whose disentangled attention mechanism encodes token content and relative position separately to improve model performance
Large-scale Pre-training
Pre-trained on the 180 GB version of the WuDao corpus, giving the model strong language understanding capabilities
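The whole word masking idea can be illustrated with a short, hedged sketch: instead of masking individual characters independently, every character of a sampled Chinese word is masked together. The use of jieba for word segmentation below is an assumption for illustration only; the segmenter used in the original pre-training is not documented here.

```python
# Illustrative whole word masking for Chinese (assumption: jieba segmentation).
# Every character of a sampled word is masked together, so the model must
# recover the whole word from context rather than isolated characters.
import random
import jieba

def whole_word_mask(text: str, mask_prob: float = 0.15, mask_token: str = "[MASK]") -> str:
    masked_parts = []
    for word in jieba.cut(text):  # segment the text into words
        if random.random() < mask_prob:
            masked_parts.append(mask_token * len(word))  # one [MASK] per character of the word
        else:
            masked_parts.append(word)
    return "".join(masked_parts)

print(whole_word_mask("生活的真谛是快乐"))
```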
Model Capabilities
Text Completion
Sentiment Analysis
Text Classification
Natural Language Inference
Masked Language Modeling
Use Cases
Text Analysis
Semantic Matching
Determine whether two sentences express the same meaning
Achieved 74.98% accuracy on the AFQMC dataset
News Classification
Classify news texts into topic categories
Achieved 58.17% accuracy on the TNEWS dataset
Natural Language Understanding
Natural Language Inference
Determine the logical relationship (entailment, neutral, or contradiction) between two texts
Achieved 80.22% accuracy on the OCNLI dataset (a fine-tuning sketch follows)
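The classification results above come from fine-tuning the pre-trained checkpoint on each downstream dataset. The following is a minimal, hedged sketch of such a setup, assuming the Hugging Face model id IDEA-CCNL/Erlangshen-DeBERTa-v2-320M-Chinese and a three-label NLI task; the classification head is newly initialized and must be trained before the reported accuracies can be reproduced.

```python
# Hedged sketch: sentence-pair classification fine-tuning setup (e.g. OCNLI).
# The model id, label count, and example pair are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_ID = "IDEA-CCNL/Erlangshen-DeBERTa-v2-320M-Chinese"  # assumed Hugging Face id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=False)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID, num_labels=3)
# Note: the classification head is randomly initialized; fine-tune it on the task data.

# Encode a premise/hypothesis pair as a single sequence-pair input.
inputs = tokenizer("一个女人正在弹钢琴。", "一个女人正在演奏乐器。", return_tensors="pt")
labels = torch.tensor([0])  # label mapping (entailment/neutral/contradiction) is task-specific

outputs = model(**inputs, labels=labels)
outputs.loss.backward()  # plug into an optimizer or the Trainer API for actual fine-tuning
```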
Text Completion
Predict masked text content
Can accurately predict masked words, such as the 'Li' in 'Li River' in the example prompt (see the sketch below)
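A minimal sketch of masked word prediction with the transformers fill-mask pipeline follows; the model id and the example sentence (echoing the 'Li River' example above) are assumptions for illustration.

```python
# Hedged sketch: masked language modeling inference with a fill-mask pipeline.
# The model id and the example sentence are illustrative assumptions.
from transformers import AutoTokenizer, AutoModelForMaskedLM, FillMaskPipeline

MODEL_ID = "IDEA-CCNL/Erlangshen-DeBERTa-v2-320M-Chinese"  # assumed Hugging Face id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=False)
model = AutoModelForMaskedLM.from_pretrained(MODEL_ID)

fill_mask = FillMaskPipeline(model=model, tokenizer=tokenizer)

# Recover the masked character in "桂林的[MASK]江" ("Guilin's ... River").
for prediction in fill_mask("桂林的[MASK]江是著名的旅游景点。", top_k=5):
    print(prediction["token_str"], round(prediction["score"], 4))
```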