RoFormerV2: Open-Source Chinese Text Processing Model, Freely Available for a Range of Chinese Text Processing Tasks


Roformer V2 Chinese Char Large

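Developed by junnyu

RoFormerV2 is an enhanced Transformer model based on rotary position embedding (RoPE). It was developed by Zhuiyi Technology and targets Chinese text processing tasks.

Large Language Model

Transformers

Chinese #Rotary position embedding #Chinese pretrained model #Leading on the CLUE benchmark

Downloads: 84

Release date: 3/21/2022

Model Overview

RoFormerV2 is an improved Transformer model that uses rotary position embedding. It is suited to a range of Chinese natural language processing tasks, such as text classification, question answering, and language understanding.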

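Model Features

Rotary position embedding

Uses rotary position embedding, which strengthens the model's ability to capture positional information.

Multi-task learning

Uses multi-task learning to improve performance across multiple tasks.

Improved classification head

Compared with the original code, the classification head adds 2 dropout layers, 1 fully connected layer, and a ReLU activation to improve classification performance.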
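
As a rough illustration of the rotary position embedding listed above, the sketch below rotates consecutive pairs of a vector by position-dependent angles and checks that the dot product of two rotated vectors depends only on their relative offset. It assumes the interleaved-pair formulation and the base of 10000 from the RoFormer paper; `rotary_embed` and `dot` are illustrative names, not functions from the `roformer` package, and the real implementation works on batched tensors rather than Python lists.

```python
import math


def rotary_embed(vec, position, base=10000.0):
    """Rotate each consecutive pair (vec[i], vec[i+1]) by position * theta."""
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("vector length must be even")
    out = []
    for i in range(0, dim, 2):
        # i = 2 * pair_index, so the frequency is base ** (-2 * pair_index / dim)
        theta = position * base ** (-i / dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * cos_t - y * sin_t, x * sin_t + y * cos_t])
    return out


def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(p * q for p, q in zip(a, b))


q = [0.3, -1.2, 0.7, 0.5]
k = [1.1, 0.4, -0.6, 0.9]

# Same relative offset (3) at two different absolute positions.
score_a = dot(rotary_embed(q, 5), rotary_embed(k, 2))
score_b = dot(rotary_embed(q, 105), rotary_embed(k, 102))
print(abs(score_a - score_b) < 1e-9)
```

The two scores should agree up to floating-point error, which is the relative-position property that motivates the rotation.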

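Model Capabilities

Text classification

Question answering

Language understanding

Text generation

Use Cases

Natural Language Processing

Text classification

Performs well on the CLUE dev-set classification tasks, such as the iflytek and tnews datasets.

Scores higher than BERT and RoBERTa on several datasets.

Question answering systems

Suitable for building Chinese question answering systems that understand and answer user questions.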

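🚀 RoFormer-V2 Project

The RoFormer-V2 project provides implementations of the RoFormer-V2 model in both TensorFlow and PyTorch. The model performs well in evaluations on several natural language processing tasks, which makes it a useful base for related research and applications.

🚀 Quick Start

Installation

Install with pip:

pip install roformer==0.4.3

Usage Example

Basic Usage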

import torch
import tensorflow as tf
from transformers import BertTokenizer
from roformer import RoFormerForMaskedLM, TFRoFormerForMaskedLM
text = "今天[MASK]很好，我[MASK]去公園玩。"
tokenizer = BertTokenizer.from_pretrained("junnyu/roformer_v2_chinese_char_large")
pt_model = RoFormerForMaskedLM.from_pretrained("junnyu/roformer_v2_chinese_char_large")
# Load the same checkpoint as the PyTorch model so the two outputs are comparable.
tf_model = TFRoFormerForMaskedLM.from_pretrained(
    "junnyu/roformer_v2_chinese_char_large", from_pt=True
)
pt_inputs = tokenizer(text, return_tensors="pt")
tf_inputs = tokenizer(text, return_tensors="tf")
# pytorch
with torch.no_grad():
    pt_outputs = pt_model(**pt_inputs).logits[0]
pt_outputs_sentence = "pytorch: "
for i, id in enumerate(tokenizer.encode(text)):
    if id == tokenizer.mask_token_id:
        tokens = tokenizer.convert_ids_to_tokens(pt_outputs[i].topk(k=5)[1])
        pt_outputs_sentence += "[" + "||".join(tokens) + "]"
    else:
        pt_outputs_sentence += "".join(
            tokenizer.convert_ids_to_tokens([id], skip_special_tokens=True)
        )
print(pt_outputs_sentence)
# tf
tf_outputs = tf_model(**tf_inputs, training=False).logits[0]
tf_outputs_sentence = "tf: "
for i, id in enumerate(tokenizer.encode(text)):
    if id == tokenizer.mask_token_id:
        tokens = tokenizer.convert_ids_to_tokens(tf.math.top_k(tf_outputs[i], k=5)[1])
        tf_outputs_sentence += "[" + "||".join(tokens) + "]"
    else:
        tf_outputs_sentence += "".join(
            tokenizer.convert_ids_to_tokens([id], skip_special_tokens=True)
        )
print(tf_outputs_sentence)
# Top-5 predictions for each [MASK], by checkpoint size:
# small
# pytorch: 今天[的||，||是||很||也]很好，我[要||會||是||想||在]去公園玩。
# tf: 今天[的||，||是||很||也]很好，我[要||會||是||想||在]去公園玩。
# base
# pytorch: 今天[我||天||晴||園||玩]很好，我[想||要||會||就||帶]去公園玩。
# tf: 今天[我||天||晴||園||玩]很好，我[想||要||會||就||帶]去公園玩。
# large
# pytorch: 今天[天||氣||我||空||陽]很好，我[又||想||會||就||愛]去公園玩。
# tf: 今天[天||氣||我||空||陽]很好，我[又||想||會||就||愛]去公園玩。
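
The sentence-rebuilding loop in the example above can be isolated into a small, model-free helper, which makes the output format easier to inspect and test. This is only a sketch: `render_fill_mask` and its parameters are hypothetical names, not part of `roformer` or `transformers`, and it assumes the tokens are already plain strings with special tokens removed.

```python
def render_fill_mask(tokens, candidates, mask_token="[MASK]", sep="||"):
    """Rebuild a sentence, replacing each mask with its bracketed candidates.

    tokens:     tokens of the input sentence, special tokens already removed
    candidates: one list of predicted tokens per mask, in order of appearance
    """
    remaining = iter(candidates)
    pieces = []
    for token in tokens:
        if token == mask_token:
            # Join the top-k candidates for this mask, e.g. "[天||氣]".
            pieces.append("[" + sep.join(next(remaining)) + "]")
        else:
            pieces.append(token)
    return "".join(pieces)


tokens = ["今", "天", "[MASK]", "很", "好"]
print(render_fill_mask(tokens, [["天", "氣"]]))  # prints 今天[天||氣]很好
```

The helper does not guard against a mismatch between the number of masks and the number of candidate lists; a real version would likely validate that first.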

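✨ Key Features

Multiple implementations: a TensorFlow version and a PyTorch + TF2.0 version are available.
- TF version: https://github.com/ZhuiyiTechnology/roformer-v2
- PyTorch + TF2.0 version: https://github.com/JunnYu/RoFormer_pytorch
Strong evaluation results: on the classification tasks of the CLUE-dev and CLUE-1.0-test leaderboards, RoFormerV2-pytorch scores higher than RoFormer-pytorch on all seven CLUE-dev tasks and on six of the seven CLUE-1.0-test tasks (all except csl).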

📦 Evaluation Comparison

CLUE-dev classification results (base and large versions)

Model	iflytek	tnews	afqmc	cmnli	ocnli	wsc	csl
BERT	60.06	56.80	72.41	79.56	73.93	78.62	83.93
RoBERTa	60.64	58.06	74.05	81.24	76.00	87.50	84.50
RoFormer	60.91	57.54	73.52	80.92	76.07	86.84	84.63
RoFormerV2^*	60.87	56.54	72.75	80.34	75.36	80.92	84.67
GAU-α	61.41	57.76	74.17	81.82	75.86	79.93	85.67
RoFormer-pytorch (this repository's code)	60.60	57.51	74.44	80.79	75.67	86.84	84.77
RoFormerV2-pytorch (this repository's code)	62.87	59.03	76.20	80.85	79.73	87.82	91.87
GAU-α-pytorch (Adafactor)	61.18	57.52	73.42	80.91	75.69	80.59	85.5
GAU-α-pytorch (AdamW wd0.01 warmup0.1)	60.68	57.95	73.08	81.02	75.36	81.25	83.93
RoFormerV2-large-pytorch (this repository's code)	61.75	59.21	76.14	82.35	81.73	91.45	91.5
Chinesebert-large-pytorch	61.25	58.67	74.70	82.65	79.63	87.83	84.97

CLUE-1.0-test classification results (base and large versions)

Model	iflytek	tnews	afqmc	cmnli	ocnli	wsc	csl
RoFormer-pytorch (this repository's code)	59.54	57.34	74.46	80.23	73.67	80.69	84.57
RoFormerV2-pytorch (this repository's code)	63.15	58.24	75.42	80.59	74.17	83.79	83.73
GAU-α-pytorch (Adafactor)	61.38	57.08	74.05	80.37	73.53	74.83	85.6
GAU-α-pytorch (AdamW wd0.01 warmup0.1)	60.54	57.67	72.44	80.32	72.97	76.55	84.13
RoFormerV2-large-pytorch (this repository's code)	61.85	59.13	76.38	80.97	76.23	85.86	84.33
Chinesebert-large-pytorch	61.54	58.57	74.8	81.94	76.93	79.66	85.1
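
To compare rows at a glance, one option is to average the seven task scores per model. The sketch below does this for three rows copied from the CLUE-1.0-test table above. The unweighted mean is an assumption made here for illustration and is not the official CLUE leaderboard score; the variable names are likewise illustrative.

```python
from statistics import mean

TASKS = ["iflytek", "tnews", "afqmc", "cmnli", "ocnli", "wsc", "csl"]

# Scores copied from the CLUE-1.0-test table, in TASKS order.
test_scores = {
    "RoFormer-pytorch": [59.54, 57.34, 74.46, 80.23, 73.67, 80.69, 84.57],
    "RoFormerV2-pytorch": [63.15, 58.24, 75.42, 80.59, 74.17, 83.79, 83.73],
    "RoFormerV2-large-pytorch": [61.85, 59.13, 76.38, 80.97, 76.23, 85.86, 84.33],
}

# Unweighted mean over the seven tasks (an illustrative summary only).
averages = {model: mean(scores) for model, scores in test_scores.items()}
for model, avg in averages.items():
    print(f"{model}: {avg:.2f}")

# Tasks where RoFormerV2-pytorch scores higher than RoFormer-pytorch.
wins = [
    task
    for task, old, new in zip(
        TASKS, test_scores["RoFormer-pytorch"], test_scores["RoFormerV2-pytorch"]
    )
    if new > old
]
print(wins)
```

A per-task comparison such as `wins` is often more informative than the mean, since a single task (for example wsc) can move the average noticeably.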

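Notes

RoFormerV2^* denotes the RoFormerV2 model trained without multi-task learning. Jianlin Su has not open-sourced that model; thanks to him for pointing this out.
Results without the pytorch suffix are copied from the GAU-alpha repository.
Results with the pytorch suffix were obtained by training with this repository's code.
Jianlin Su's code takes the [CLS] token representation and classifies it directly, whereas this repository uses the classification head below, which adds 2 dropout layers, 1 dense layer, and 1 ReLU activation.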

from torch import nn
from transformers.activations import ACT2FN


class RoFormerClassificationHead(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.dense = nn.Linear(config.hidden_size, config.hidden_size)
        self.dropout = nn.Dropout(config.hidden_dropout_prob)
        self.out_proj = nn.Linear(config.hidden_size, config.num_labels)

        self.config = config

    def forward(self, features, **kwargs):
        x = features[:, 0, :]  # take <s> token (equiv. to [CLS])
        x = self.dropout(x)  # first dropout, before the dense layer
        x = self.dense(x)
        x = ACT2FN[self.config.hidden_act](x)  # ReLU in this configuration
        x = self.dropout(x)  # second dropout, before the output projection
        x = self.out_proj(x)
        return x

📚 Detailed Documentation

Citation

BibTeX:

@misc{su2021roformer,
      title={RoFormer: Enhanced Transformer with Rotary Position Embedding}, 
      author={Jianlin Su and Yu Lu and Shengfeng Pan and Bo Wen and Yunfeng Liu},
      year={2021},
      eprint={2104.09864},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

@techreport{roformerv2,
  title={RoFormerV2: A Faster and Better RoFormer - ZhuiyiAI},
  author={Jianlin Su and Shengfeng Pan and Bo Wen and Yunfeng Liu},
  year={2022},
  url="https://github.com/ZhuiyiTechnology/roformer-v2",
}