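🚀 RoFormer-V2 Project
The RoFormer-V2 project provides TensorFlow and PyTorch implementations of the RoFormer-V2 model. The model performs well on several natural language processing benchmarks, which makes it a useful basis for research and applications in the field.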
🚀 Quick Start
Installation
pip install roformer==0.4.3
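To confirm that the pinned version is the one actually installed, a small check along the following lines can help. This is a minimal sketch using only the standard library; the helper names `installed_version` and `matches_pin` are illustrative and are not part of the roformer package.

```python
from importlib import metadata
from typing import Optional

PINNED_VERSION = "0.4.3"  # the version pinned in the install command above


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(found: Optional[str], pinned: str = PINNED_VERSION) -> bool:
    """True only when a version was found and it equals the pinned version."""
    return found is not None and found == pinned


if __name__ == "__main__":
    found = installed_version("roformer")
    if matches_pin(found):
        print(f"roformer {found} is installed")
    else:
        print(f"expected roformer {PINNED_VERSION}, found {found}")
```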
Usage Example
Basic Usage
import torch
import tensorflow as tf
from transformers import BertTokenizer
from roformer import RoFormerForMaskedLM, TFRoFormerForMaskedLM

text = "今天[MASK]很好,我[MASK]去公園玩。"
tokenizer = BertTokenizer.from_pretrained("junnyu/roformer_v2_chinese_char_large")
pt_model = RoFormerForMaskedLM.from_pretrained("junnyu/roformer_v2_chinese_char_large")
# Note: the TF model loads the base checkpoint, converted from PyTorch weights.
tf_model = TFRoFormerForMaskedLM.from_pretrained(
    "junnyu/roformer_v2_chinese_char_base", from_pt=True
)
pt_inputs = tokenizer(text, return_tensors="pt")
tf_inputs = tokenizer(text, return_tensors="tf")

# PyTorch: logits of the single sequence in the batch
with torch.no_grad():
    pt_outputs = pt_model(**pt_inputs).logits[0]
pt_outputs_sentence = "pytorch: "
for i, token_id in enumerate(tokenizer.encode(text)):
    if token_id == tokenizer.mask_token_id:
        # Replace each [MASK] with its five highest-scoring candidates
        tokens = tokenizer.convert_ids_to_tokens(pt_outputs[i].topk(k=5)[1])
        pt_outputs_sentence += "[" + "||".join(tokens) + "]"
    else:
        # Copy ordinary tokens through and drop special tokens
        pt_outputs_sentence += "".join(
            tokenizer.convert_ids_to_tokens([token_id], skip_special_tokens=True)
        )
print(pt_outputs_sentence)

# TensorFlow: same decoding, using tf.math.top_k for the candidates
tf_outputs = tf_model(**tf_inputs, training=False).logits[0]
tf_outputs_sentence = "tf: "
for i, token_id in enumerate(tokenizer.encode(text)):
    if token_id == tokenizer.mask_token_id:
        tokens = tokenizer.convert_ids_to_tokens(tf.math.top_k(tf_outputs[i], k=5)[1])
        tf_outputs_sentence += "[" + "||".join(tokens) + "]"
    else:
        tf_outputs_sentence += "".join(
            tokenizer.convert_ids_to_tokens([token_id], skip_special_tokens=True)
        )
print(tf_outputs_sentence)
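The decoding loop in the example above does not depend on either framework: for every position it either copies the token through or, at a mask position, lists the k highest-scoring candidates. The sketch below shows that logic in isolation with plain Python lists. The toy vocabulary, the scores, and the function names `top_k_ids` and `render_predictions` are invented for illustration; no model is loaded.

```python
from typing import Dict, List, Sequence


def top_k_ids(scores: Sequence[float], k: int) -> List[int]:
    """Indices of the k largest scores, highest first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def render_predictions(
    token_ids: Sequence[int],
    logits: Sequence[Sequence[float]],
    id_to_token: Dict[int, str],
    mask_id: int,
    special_ids: Sequence[int] = (),
    k: int = 5,
) -> str:
    """Rebuild the sentence, replacing each mask with its top-k candidates."""
    pieces = []
    for position, token_id in enumerate(token_ids):
        if token_id == mask_id:
            candidates = [id_to_token[i] for i in top_k_ids(logits[position], k)]
            pieces.append("[" + "||".join(candidates) + "]")
        elif token_id not in special_ids:
            # Ordinary tokens are copied through; special tokens are dropped.
            pieces.append(id_to_token[token_id])
    return "".join(pieces)


# Toy vocabulary and scores, invented purely for illustration.
vocab = {0: "[CLS]", 1: "[SEP]", 2: "[MASK]", 3: "a", 4: "b", 5: "c"}
ids = [0, 3, 2, 5, 1]
scores = [
    [0.0] * 6,
    [0.0] * 6,
    [0.1, 0.0, 0.2, 0.9, 0.5, 0.7],  # scores at the mask position
    [0.0] * 6,
    [0.0] * 6,
]
result = render_predictions(ids, scores, vocab, mask_id=2, special_ids=(0, 1), k=2)
print(result)
```

With these toy inputs, the two best candidates at the mask position are the tokens with ids 3 and 5, so the rendered string is `a[a||c]c`.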
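✨ Key Features
- Multiple framework versions: implementations are provided as a TensorFlow version and as a PyTorch + TF2.0 version.
- Strong benchmark results: on the classification tasks of the CLUE-dev and CLUE-1.0-test leaderboards, RoFormerV2-pytorch has the highest score among the models not marked large on six of the seven tasks in each table (all except cmnli on CLUE-dev and csl on CLUE-1.0-test).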
📦 Evaluation Comparison
CLUE-dev leaderboard classification results (base + large versions)
| Model | iflytek | tnews | afqmc | cmnli | ocnli | wsc | csl |
| --- | --- | --- | --- | --- | --- | --- | --- |
| BERT | 60.06 | 56.80 | 72.41 | 79.56 | 73.93 | 78.62 | 83.93 |
| RoBERTa | 60.64 | 58.06 | 74.05 | 81.24 | 76.00 | 87.50 | 84.50 |
| RoFormer | 60.91 | 57.54 | 73.52 | 80.92 | 76.07 | 86.84 | 84.63 |
| RoFormerV2* | 60.87 | 56.54 | 72.75 | 80.34 | 75.36 | 80.92 | 84.67 |
| GAU-α | 61.41 | 57.76 | 74.17 | 81.82 | 75.86 | 79.93 | 85.67 |
| RoFormer-pytorch (this repository's code) | 60.60 | 57.51 | 74.44 | 80.79 | 75.67 | 86.84 | 84.77 |
| RoFormerV2-pytorch (this repository's code) | 62.87 | 59.03 | 76.20 | 80.85 | 79.73 | 87.82 | 91.87 |
| GAU-α-pytorch (Adafactor) | 61.18 | 57.52 | 73.42 | 80.91 | 75.69 | 80.59 | 85.5 |
| GAU-α-pytorch (AdamW wd0.01 warmup0.1) | 60.68 | 57.95 | 73.08 | 81.02 | 75.36 | 81.25 | 83.93 |
| RoFormerV2-large-pytorch (this repository's code) | 61.75 | 59.21 | 76.14 | 82.35 | 81.73 | 91.45 | 91.5 |
| Chinesebert-large-pytorch | 61.25 | 58.67 | 74.70 | 82.65 | 79.63 | 87.83 | 84.97 |
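One simple way to summarize a table like this is the unweighted mean over the seven tasks, together with the best model per task. The sketch below computes both for three rows copied from the CLUE-dev results above. The averaging is an illustration only and is not a figure reported by the original authors; the helper names `task_average` and `best_per_task` are hypothetical.

```python
from statistics import mean

TASKS = ["iflytek", "tnews", "afqmc", "cmnli", "ocnli", "wsc", "csl"]

# Scores copied from the CLUE-dev results above (same task order as TASKS).
DEV_SCORES = {
    "RoFormer-pytorch": [60.60, 57.51, 74.44, 80.79, 75.67, 86.84, 84.77],
    "RoFormerV2-pytorch": [62.87, 59.03, 76.20, 80.85, 79.73, 87.82, 91.87],
    "RoFormerV2-large-pytorch": [61.75, 59.21, 76.14, 82.35, 81.73, 91.45, 91.5],
}


def task_average(scores):
    """Unweighted mean over all tasks, rounded to two decimals."""
    return round(mean(scores), 2)


def best_per_task(table):
    """Map each task name to the model with the highest score on it."""
    return {
        task: max(table, key=lambda model: table[model][i])
        for i, task in enumerate(TASKS)
    }


for model, scores in DEV_SCORES.items():
    print(f"{model}: {task_average(scores)}")
print(best_per_task(DEV_SCORES))
```

An unweighted mean treats every task equally regardless of dataset size, so it is only a rough summary of the per-task numbers.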
CLUE-1.0-test leaderboard classification results (base + large versions)
| Model | iflytek | tnews | afqmc | cmnli | ocnli | wsc | csl |
| --- | --- | --- | --- | --- | --- | --- | --- |
| RoFormer-pytorch (this repository's code) | 59.54 | 57.34 | 74.46 | 80.23 | 73.67 | 80.69 | 84.57 |
| RoFormerV2-pytorch (this repository's code) | 63.15 | 58.24 | 75.42 | 80.59 | 74.17 | 83.79 | 83.73 |
| GAU-α-pytorch (Adafactor) | 61.38 | 57.08 | 74.05 | 80.37 | 73.53 | 74.83 | 85.6 |
| GAU-α-pytorch (AdamW wd0.01 warmup0.1) | 60.54 | 57.67 | 72.44 | 80.32 | 72.97 | 76.55 | 84.13 |
| RoFormerV2-large-pytorch (this repository's code) | 61.85 | 59.13 | 76.38 | 80.97 | 76.23 | 85.86 | 84.33 |
| Chinesebert-large-pytorch | 61.54 | 58.57 | 74.8 | 81.94 | 76.93 | 79.66 | 85.1 |
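Notes
- RoFormerV2* denotes the RoFormerV2 model without multi-task learning. Jianlin Su has not open-sourced this model; thanks to him for pointing this out.
- Results without the pytorch suffix are copied from the GAU-alpha repository.
- Results with the pytorch suffix come from our own training runs.
- Jianlin Su's code classifies directly on the [CLS] token output, whereas this repository uses the classification head below, which adds two dropouts, one dense layer, and one ReLU activation.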
from torch import nn
from transformers.activations import ACT2FN


class RoFormerClassificationHead(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.dense = nn.Linear(config.hidden_size, config.hidden_size)
        self.dropout = nn.Dropout(config.hidden_dropout_prob)
        self.out_proj = nn.Linear(config.hidden_size, config.num_labels)
        self.config = config

    def forward(self, features, **kwargs):
        x = features[:, 0, :]  # hidden state at the first ([CLS]) position
        x = self.dropout(x)
        x = self.dense(x)
        x = ACT2FN[self.config.hidden_act](x)  # activation named in the config
        x = self.dropout(x)
        x = self.out_proj(x)
        return x
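To make the difference concrete, the sketch below places a direct [CLS] classifier next to a head with the same layer order as the one above. It is a simplified illustration rather than the repository's code: the class names `DirectClsHead` and `MlpClsHead`, the hard-coded ReLU, the dropout rate, and the toy sizes are all assumptions, and the direct variant is only a reading of the note above, not a copy of any other codebase.

```python
import torch
from torch import nn


class DirectClsHead(nn.Module):
    """Classify straight from the first ([CLS]) position with one linear layer."""

    def __init__(self, hidden_size: int, num_labels: int):
        super().__init__()
        self.out_proj = nn.Linear(hidden_size, num_labels)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.out_proj(features[:, 0, :])


class MlpClsHead(nn.Module):
    """dropout -> dense -> ReLU -> dropout -> output projection."""

    def __init__(self, hidden_size: int, num_labels: int, dropout: float = 0.1):
        super().__init__()
        self.dense = nn.Linear(hidden_size, hidden_size)
        self.dropout = nn.Dropout(dropout)
        self.out_proj = nn.Linear(hidden_size, num_labels)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        x = self.dropout(features[:, 0, :])
        x = torch.relu(self.dense(x))
        x = self.dropout(x)
        return self.out_proj(x)


def count_parameters(module: nn.Module) -> int:
    """Total number of parameters in the module."""
    return sum(p.numel() for p in module.parameters())


features = torch.randn(2, 5, 8)  # (batch, sequence length, hidden size)
direct = DirectClsHead(hidden_size=8, num_labels=3).eval()
mlp = MlpClsHead(hidden_size=8, num_labels=3).eval()
with torch.no_grad():
    print(direct(features).shape, count_parameters(direct))
    print(mlp(features).shape, count_parameters(mlp))
```

Both heads map a `(batch, sequence, hidden)` tensor to `(batch, num_labels)` logits; the extra dense layer adds `hidden_size * hidden_size + hidden_size` parameters, which is one reason scores from the two setups are not strictly comparable.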
📚 Detailed Documentation
Citation
BibTeX:
@misc{su2021roformer,
    title={RoFormer: Enhanced Transformer with Rotary Position Embedding},
    author={Jianlin Su and Yu Lu and Shengfeng Pan and Bo Wen and Yunfeng Liu},
    year={2021},
    eprint={2104.09864},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
@techreport{roformerv2,
    title={RoFormerV2: A Faster and Better RoFormer - ZhuiyiAI},
    author={Jianlin Su and Shengfeng Pan and Bo Wen and Yunfeng Liu},
    year={2022},
    url="https://github.com/ZhuiyiTechnology/roformer-v2",
}