
Erlangshen-DeBERTa-v2-97M-Chinese

Developed by IDEA-CCNL
A 97-million-parameter Chinese DeBERTa-v2 base model for natural language understanding tasks, pre-trained with Whole Word Masking.
Downloads: 178
Release Time: 7/19/2022

Model Overview

A DeBERTa (Decoding-enhanced BERT with disentangled Attention) model designed specifically for Chinese natural language understanding tasks, suitable for text classification, sentiment analysis, and similar scenarios.

Model Features

Whole Word Masking
Applies Whole Word Masking during pre-training, masking all the characters of a Chinese word together rather than individual characters, to strengthen the model's understanding of Chinese vocabulary.
Disentangled Attention Mechanism
Builds on the DeBERTa-v2 architecture, whose disentangled attention represents each token's content and position with separate vectors to improve performance (sketched after this list).
Chinese Optimization
Optimized for the characteristics of the Chinese language and pre-trained on 180 GB of the WuDao corpus.
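
To make the disentangled attention feature concrete, here is a toy NumPy sketch of the score decomposition from the DeBERTa paper: content-to-content, content-to-position, and position-to-content terms, scaled by 1/sqrt(3d). All dimensions, weights, and the relative-distance clipping below are illustrative, not the actual model configuration.

```python
# Toy sketch of DeBERTa-style disentangled attention for a single head.
# Shapes and weights are illustrative only.
import numpy as np

L, d, k = 6, 8, 4  # sequence length, head dim, max relative distance

rng = np.random.default_rng(0)
H = rng.normal(size=(L, d))       # content hidden states
P = rng.normal(size=(2 * k, d))   # relative position embeddings

Wq, Wk = rng.normal(size=(d, d)), rng.normal(size=(d, d))
Wqr, Wkr = rng.normal(size=(d, d)), rng.normal(size=(d, d))

Qc, Kc = H @ Wq, H @ Wk           # content projections
Qr, Kr = P @ Wqr, P @ Wkr         # position projections

def delta(i, j):
    # Relative distance clipped into [0, 2k), as in the paper
    return int(np.clip(i - j, -k, k - 1)) + k

A = np.zeros((L, L))
for i in range(L):
    for j in range(L):
        c2c = Qc[i] @ Kc[j]            # content-to-content
        c2p = Qc[i] @ Kr[delta(i, j)]  # content-to-position
        p2c = Kc[j] @ Qr[delta(j, i)]  # position-to-content
        A[i, j] = (c2c + c2p + p2c) / np.sqrt(3 * d)

attn = np.exp(A - A.max(axis=-1, keepdims=True))
attn /= attn.sum(axis=-1, keepdims=True)  # row-wise softmax over keys
```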

Model Capabilities

Text classification
Sentiment analysis
Natural language inference
Masked language modeling
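
Masked language modeling can be tried directly with a fill-mask pipeline. A minimal sketch, assuming the checkpoint is published on the Hugging Face Hub as IDEA-CCNL/Erlangshen-DeBERTa-v2-97M-Chinese and that the transformers library is installed; the example sentence is illustrative.

```python
# Minimal fill-mask sketch; model ID assumes the Hugging Face Hub release.
from transformers import AutoModelForMaskedLM, AutoTokenizer, FillMaskPipeline

model_id = "IDEA-CCNL/Erlangshen-DeBERTa-v2-97M-Chinese"

# use_fast=False: the checkpoint ships a sentencepiece-style tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False)
model = AutoModelForMaskedLM.from_pretrained(model_id)

fill_mask = FillMaskPipeline(model=model, tokenizer=tokenizer)

# "The true meaning of life is [MASK]."
text = "生活的真谛是[MASK]。"
for candidate in fill_mask(text, top_k=5):
    print(candidate["token_str"], round(candidate["score"], 4))
```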

Use Cases

Text Analysis
News Classification: classify news texts; 57.1% accuracy on the TNEWS dataset.
App Classification: classify application descriptions; 59.77% accuracy on the IFLYTEK dataset.
Semantic Understanding
Natural Language Inference: determine the logical relationship between sentence pairs; 75.68% accuracy on the OCNLI dataset.
Chinese Inference: perform Chinese natural language inference; 80.7% accuracy on the CMNLI dataset.
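
The classification and inference numbers above come from fine-tuning the backbone with a task-specific head. A minimal fine-tuning sketch for text classification, assuming the same Hub model ID; the tiny inline dataset and its two labels are placeholders for a real benchmark such as CLUE TNEWS.

```python
# Minimal sequence-classification fine-tuning sketch.
# The two-example dataset and label scheme are illustrative only.
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_id = "IDEA-CCNL/Erlangshen-DeBERTa-v2-97M-Chinese"

# Toy stand-in for labeled data: 0 = sports, 1 = finance (illustrative)
train_ds = Dataset.from_dict({
    "sentence": ["国足晋级世界杯预选赛下一轮", "央行宣布下调存款准备金率"],
    "label": [0, 1],
})

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

def tokenize(batch):
    # Pad to a fixed length so the default collator can stack tensors
    return tokenizer(batch["sentence"], truncation=True,
                     padding="max_length", max_length=64)

train_ds = train_ds.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="erlangshen-clf",
    num_train_epochs=1,
    per_device_train_batch_size=2,
    learning_rate=2e-5,
)

Trainer(model=model, args=args, train_dataset=train_ds).train()
```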