StructBERT-large-zh open-source model: incorporates language structure to improve text processing and comprehension

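Developed by junnyu

StructBERT is a model that extends BERT by incorporating language structure into pre-training; two auxiliary tasks exploit the sequential order of words and sentences.

Large language model

Transformers

Chinese #Chinese pre-training #structure-enhanced BERT #language understanding

Downloads: 77

Release date: 5/18/2022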

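Model overview

StructBERT is a BERT model improved by incorporating language structure into pre-training, which strengthens language understanding at the word and sentence level.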

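Model features

Structure-aware pre-training

Pre-trained with two auxiliary tasks that exploit the sequential order of words and sentences

Deep language understanding

Captures language structure more effectively at the word and sentence level

Large-scale pre-training

Based on the BERT-large architecture, with 330 million parameters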

Model capabilities

Text classification

Natural language inference

Semantic similarity computation

Question answering

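Use cases

Natural language processing

Text classification

Used for tasks such as news classification

Achieves 68.67% accuracy on the TNEWS dataset

Natural language inference

Determines the logical relationship between sentences

Achieves 84.47% accuracy on the CMNLI dataset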

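🚀 StructBERT: Unofficial Copy

This model incorporates language structure into pre-training to achieve deep language understanding. It is an unofficial copy, reproduced from the information in the official repository.

🚀 Quick Start

Download the model and the tokenizer vocabulary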

wget https://raw.githubusercontent.com/alibaba/AliceMind/main/StructBERT/config/ch_large_bert_config.json && mv ch_large_bert_config.json config.json
wget https://raw.githubusercontent.com/alibaba/AliceMind/main/StructBERT/config/ch_vocab.txt && mv ch_vocab.txt vocab.txt
wget https://alice-open.oss-cn-zhangjiakou.aliyuncs.com/StructBERT/ch_model && mv ch_model pytorch_model.bin
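Before loading, it can help to confirm that the download directory has the layout the loading step expects. The sketch below is illustrative only: it assumes the three files are named config.json, vocab.txt and pytorch_model.bin (the default names transformers looks for in a local BERT directory), and `check_model_dir` is a hypothetical helper, not part of any library.

```python
import json
from pathlib import Path

# Assumed file names: the defaults transformers looks for in a local BERT directory.
EXPECTED_FILES = ("config.json", "vocab.txt", "pytorch_model.bin")


def check_model_dir(model_dir):
    """Return a list of human-readable problems found in model_dir (empty if none)."""
    root = Path(model_dir)
    problems = [f"missing file: {name}" for name in EXPECTED_FILES
                if not (root / name).is_file()]
    config_path, vocab_path = root / "config.json", root / "vocab.txt"
    if config_path.is_file() and vocab_path.is_file():
        config = json.loads(config_path.read_text(encoding="utf-8"))
        vocab_size = config.get("vocab_size")
        # A BERT vocabulary file holds one token per line.
        with vocab_path.open(encoding="utf-8") as fh:
            n_tokens = sum(1 for _ in fh)
        # More tokens than embedding rows would produce out-of-range token ids.
        if isinstance(vocab_size, int) and n_tokens > vocab_size:
            problems.append(
                f"vocab.txt has {n_tokens} tokens but config vocab_size is {vocab_size}")
    return problems


if __name__ == "__main__":
    for problem in check_model_dir("."):
        print(problem)
```

An empty result means the directory at least has the expected shape; it does not validate the weights themselves.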

Load the model and tokenizer and upload them to the HF Hub

from transformers import BertConfig, BertModel, BertTokenizer

# Load the downloaded files from the current directory.
config = BertConfig.from_pretrained("./config.json")
model = BertModel.from_pretrained("./", config=config)
tokenizer = BertTokenizer.from_pretrained("./")

# Upload to the Hugging Face Hub (requires being logged in).
model.push_to_hub("structbert-large-zh")
tokenizer.push_to_hub("structbert-large-zh")

Paper link

https://arxiv.org/abs/1908.04577

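✨ Key Features

Model overview

StructBERT extends BERT by incorporating language structure into pre-training. Specifically, it is pre-trained with two auxiliary tasks that exploit the sequential order of words and of sentences, so that it captures language structure at both the word and the sentence level.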
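To make the two auxiliary tasks more concrete, here is a minimal sketch of how training inputs for them might be built. It follows my reading of the linked paper (a word-level task that restores the order of a shuffled token span, and a sentence-level task that classifies the second sentence as following, preceding, or unrelated) and is heavily simplified: the function names, the single shuffled span, and the default span length of 3 are assumptions for illustration, not the official implementation.

```python
import random


def shuffle_span(tokens, span_len=3, rng=random):
    """Word-level task input: permute one random span of `span_len` tokens.

    Returns (shuffled_tokens, start, original_span). A model would be trained
    to predict original_span at positions start .. start + span_len - 1.
    """
    if len(tokens) < span_len:
        return list(tokens), 0, []
    start = rng.randrange(len(tokens) - span_len + 1)
    original = list(tokens[start:start + span_len])
    permuted = original[:]
    rng.shuffle(permuted)
    shuffled = list(tokens[:start]) + permuted + list(tokens[start + span_len:])
    return shuffled, start, original


def make_sentence_pair(doc, index, other_doc, rng=random):
    """Sentence-level task input: pair sentence `index` with a second sentence.

    Label 0: the following sentence, 1: the preceding sentence,
    2: a sentence sampled from another document.
    """
    label = rng.randrange(3)
    if label == 0 and index + 1 < len(doc):
        return doc[index], doc[index + 1], 0
    if label == 1 and index > 0:
        return doc[index], doc[index - 1], 1
    # Also the fallback when the requested neighbour does not exist.
    return doc[index], rng.choice(other_doc), 2


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    print(shuffle_span(["the", "cat", "sat", "on", "the", "mat"], rng=rng))
    print(make_sentence_pair(["s0", "s1", "s2"], 1, ["x0", "x1"], rng=rng))
```

Both functions only prepare inputs and targets; the prediction heads and loss terms that consume them are omitted here.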

Pre-trained models

Model	Description	Parameters	Download
structbert.en.large	StructBERT (BERT-large architecture)	340M	structbert.en.large
structroberta.en.large	StructRoBERTa (continued training from RoBERTa)	355M	Coming soon
structbert.ch.large	Chinese StructBERT (BERT-large architecture)	330M	structbert.ch.large
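The parameter counts for the two BERT-large variants can be roughly reproduced from the architecture alone. The sketch below counts the weights and biases of a BERT encoder with a pooler and no task head; the layer sizes are the standard BERT-large values, and the vocabulary sizes (21128 for Chinese, 30522 for English) are assumptions taken from the original BERT vocabularies rather than from this model card.

```python
def count_bert_params(vocab_size, hidden=1024, layers=24, intermediate=4096,
                      max_positions=512, type_vocab=2):
    """Count weights and biases of a BERT encoder with a pooler, no task head."""
    layer_norm = 2 * hidden  # scale and bias
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm
    attention = 4 * (hidden * hidden + hidden) + layer_norm  # Q, K, V, output
    feed_forward = (hidden * intermediate + intermediate
                    + intermediate * hidden + hidden + layer_norm)
    pooler = hidden * hidden + hidden
    return embeddings + layers * (attention + feed_forward) + pooler


# Vocabulary sizes are assumptions (original BERT Chinese and English vocabularies).
for name, vocab_size in [("structbert.ch.large", 21128),
                         ("structbert.en.large", 30522)]:
    print(f"{name}: {count_bert_params(vocab_size) / 1e6:.1f}M")
```

Under these assumptions the totals come to about 326M and 335M, in line with the rounded 330M and 340M figures; the gap between the two models comes entirely from the size of the token embedding table.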

Results

structbert.en.large (GLUE benchmark)

Model	MNLI	QNLIv2	QQP	SST-2	MRPC
structbert.en.large	86.86%	93.04%	91.67%	93.23%	86.51%

structbert.ch.large (CLUE benchmark)

Model	CMNLI	OCNLI	TNEWS	AFQMC
structbert.ch.large	84.47%	81.28%	68.67%	76.11%

📦 Installation

Requirements and installation

PyTorch version >= 1.0.1
Install the other libraries with the following command:

pip install -r requirements.txt

For faster training, we recommend installing NVIDIA's apex library.
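The PyTorch version requirement above can be checked from Python before training starts. This is a small sketch using only the standard library; `parse_version` and `meets_minimum` are hypothetical helpers, and the parsing is deliberately simple (pre-release tags such as `rc1` are ignored).

```python
import re


def parse_version(text):
    """Turn '1.13.1+cu117' into (1, 13, 1); the local suffix after '+' is ignored."""
    numbers = re.findall(r"\d+", text.split("+")[0])
    return tuple(int(n) for n in numbers[:3])


def meets_minimum(installed, minimum="1.0.1"):
    """True if the installed version is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    try:
        from importlib.metadata import version  # Python 3.8+
        print("PyTorch OK:", meets_minimum(version("torch")))
    except Exception:  # torch not installed, or package metadata unavailable
        print("PyTorch not found")
```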

💻 Usage Examples

Basic usage

Fine-tuning on MNLI

python run_classifier_multi_task.py \
  --task_name MNLI \
  --do_train \
  --do_eval \
  --do_test \
  --amp_type O1 \
  --lr_decay_factor 1 \
  --dropout 0.1 \
  --do_lower_case \
  --detach_index -1 \
  --core_encoder bert \
  --data_dir path_to_glue_data \
  --vocab_file config/vocab.txt \
  --bert_config_file config/large_bert_config.json \
  --init_checkpoint path_to_pretrained_model \
  --max_seq_length 128 \
  --train_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 3 \
  --fast_train \
  --gradient_accumulation_steps 1 \
  --output_dir path_to_output_dir
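To get a feel for the length of the run configured above, the number of optimizer updates can be estimated from the batch size, epoch count and accumulation setting. This is a back-of-the-envelope sketch, not the script's own logic: it assumes `train_batch_size` examples are consumed per forward pass and that the last partial batch of each epoch is kept, and `training_steps` is a hypothetical helper.

```python
import math


def training_steps(num_examples, train_batch_size, num_train_epochs,
                   gradient_accumulation_steps=1):
    """Estimate optimizer updates for a run (see the assumptions in the text)."""
    batches_per_epoch = math.ceil(num_examples / train_batch_size)
    updates_per_epoch = math.ceil(batches_per_epoch / gradient_accumulation_steps)
    return updates_per_epoch * num_train_epochs


# 392,702 is the size of the MNLI training set in GLUE.
print(training_steps(392_702, train_batch_size=32, num_train_epochs=3))
```

With the flags shown above this gives roughly 36.8k updates; the training script may round differently.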

📄 License

Notice

This model card was not written by the AliceMind Team.

Citation

If you use this work, please cite the following paper.

@article{wang2019structbert,
  title={Structbert: Incorporating language structures into pre-training for deep language understanding},
  author={Wang, Wei and Bi, Bin and Yan, Ming and Wu, Chen and Bao, Zuyi and Xia, Jiangnan and Peng, Liwei and Si, Luo},
  journal={arXiv preprint arXiv:1908.04577},
  year={2019}
}