DistilBERT open-source question answering model: free to deploy, with fewer parameters, faster inference, and strong performance

Distilbert Base Uncased Distilled Squad

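Developed by distilbert

DistilBERT is a lightweight distilled version of BERT with 40% fewer parameters that runs 60% faster while retaining over 95% of BERT's performance on the GLUE benchmark. This model is fine-tuned for question answering.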

Question answering

Transformers

English | Open-source license: Apache-2.0 | #question-answering #lightweight-bert #knowledge-distillation

Downloads: 154.39k

Released: 3/2/2022

Model overview

A fine-tuned model based on DistilBERT-base-uncased, trained with knowledge distillation on the SQuAD v1.1 dataset and suited to English question answering.

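Model features

Efficient and lightweight

Compared with the original BERT model, it has 40% fewer parameters and runs inference 60% faster

High performance

Retains over 95% of BERT's performance on the GLUE benchmark

Optimized for question answering

Fine-tuned specifically for SQuAD question answering, reaching an F1 score of 86.9 on the SQuAD v1.1 dev set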

Model capabilities

Extractive question answering

Text comprehension

Answer span localization

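Use cases

Question answering systems

Document-based question answering

Extract the answer to a question from a given text

Reaches an F1 score of 86.9 on the SQuAD v1.1 dataset

Knowledge retrieval

Retrieve relevant information from a knowledge base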

🚀 DistilBERT base uncased distilled SQuAD

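DistilBERT base uncased distilled SQuAD is a lightweight, fast Transformer-based language model specialized for question answering. It builds on DistilBERT, a distilled version of BERT, and is fine-tuned on the SQuAD v1.1 dataset.

🚀 Quick start

Use the code below to get started with the model.

Basic usage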

>>> from transformers import pipeline
>>> question_answerer = pipeline("question-answering", model='distilbert-base-uncased-distilled-squad')

>>> context = r"""
... Extractive Question Answering is the task of extracting an answer from a text given a question. An example of a
... question answering dataset is the SQuAD dataset, which is entirely based on that task. If you would like to fine-tune
... a model on a SQuAD task, you may leverage the examples/pytorch/question-answering/run_squad.py script.
... """

>>> result = question_answerer(question="What is a good example of a question answering dataset?", context=context)
>>> print(
...     f"Answer: '{result['answer']}', score: {round(result['score'], 4)}, start: {result['start']}, end: {result['end']}"
... )

Answer: 'SQuAD dataset', score: 0.4704, start: 147, end: 160
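
The `start` and `end` values in the output appear to be character offsets into `context`, with `end` exclusive. The sketch below checks that reading against the printed result without loading the model. The `result` dictionary is copied by hand from the output above, not produced by a live call.

```python
# Context string from the quick-start example. The triple-quoted literal
# starts with a newline, which counts as character 0.
context = r"""
Extractive Question Answering is the task of extracting an answer from a text given a question. An example of a
question answering dataset is the SQuAD dataset, which is entirely based on that task. If you would like to fine-tune
a model on a SQuAD task, you may leverage the examples/pytorch/question-answering/run_squad.py script.
"""

# Values copied by hand from the printed output (not a live model call).
result = {"answer": "SQuAD dataset", "start": 147, "end": 160}

# Slicing the context with the reported offsets recovers the answer text.
span = context[result["start"]:result["end"]]
print(span)
```

Because the offsets index into the exact string passed as `context`, any change to its whitespace shifts them.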

Advanced usage

PyTorch example

import torch
from transformers import DistilBertTokenizer, DistilBertForQuestionAnswering

# Load the tokenizer and the fine-tuned question answering model
tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased-distilled-squad')
model = DistilBertForQuestionAnswering.from_pretrained('distilbert-base-uncased-distilled-squad')

question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"

# Encode the question and the context together as one input pair
inputs = tokenizer(question, text, return_tensors="pt")
with torch.no_grad():  # inference only, so no gradients are needed
    outputs = model(**inputs)

# Pick the most likely start and end token positions
answer_start_index = torch.argmax(outputs.start_logits)
answer_end_index = torch.argmax(outputs.end_logits)

# Decode the tokens in the predicted span back into text
predict_answer_tokens = inputs.input_ids[0, answer_start_index : answer_end_index + 1]
print(tokenizer.decode(predict_answer_tokens))
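
Taking the two `argmax` values independently can yield an end index that precedes the start index. One common remedy, shown here as a minimal sketch and not as what the model card or the `pipeline` does, is to search for the pair with the highest summed logit subject to `start <= end` and a length cap. The function name `best_span`, the `max_answer_len` parameter, and the toy logits are all illustrative assumptions.

```python
from typing import Sequence, Tuple


def best_span(
    start_logits: Sequence[float],
    end_logits: Sequence[float],
    max_answer_len: int = 15,
) -> Tuple[int, int]:
    """Return (start, end) maximizing start_logit + end_logit with start <= end."""
    best = (0, 0)
    best_score = float("-inf")
    for s, s_logit in enumerate(start_logits):
        # Only consider end positions at or after the start, within the length cap.
        last = min(s + max_answer_len, len(end_logits))
        for e in range(s, last):
            score = s_logit + end_logits[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best


# Toy logits (made up): independent argmax would give start=3, end=1.
start_logits = [0.1, 0.2, 0.3, 2.0, 0.0]
end_logits = [0.0, 1.5, 0.2, 0.1, 1.0]
print(best_span(start_logits, end_logits))
```

With real model output, the logits would first be converted to plain lists (for example with `.tolist()` on the first batch row). This sketch does not exclude question or special tokens, which a production decoder would also need to handle.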

TensorFlow example

import tensorflow as tf
from transformers import DistilBertTokenizer, TFDistilBertForQuestionAnswering

# Load the tokenizer and the TensorFlow version of the fine-tuned model
tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased-distilled-squad")
model = TFDistilBertForQuestionAnswering.from_pretrained("distilbert-base-uncased-distilled-squad")

question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"

# Encode the question and the context together as one input pair
inputs = tokenizer(question, text, return_tensors="tf")
outputs = model(**inputs)

# Pick the most likely start and end token positions for the first (only) example
answer_start_index = int(tf.math.argmax(outputs.start_logits, axis=-1)[0])
answer_end_index = int(tf.math.argmax(outputs.end_logits, axis=-1)[0])

# Decode the tokens in the predicted span back into text
predict_answer_tokens = inputs.input_ids[0, answer_start_index : answer_end_index + 1]
print(tokenizer.decode(predict_answer_tokens))
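
Both framework examples stop at raw logits, whereas the quick-start output also reports a `score`. One plausible way to obtain a comparable confidence, sketched below with made-up logits, is to apply softmax to the start and end logits separately and multiply the two probabilities of the chosen span. This is an assumption about how such a score can be derived, not a description of the `pipeline` internals, and `softmax` and `span_score` are illustrative helper names.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a 1-D sequence of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [x / total for x in exps]


def span_score(
    start_logits: Sequence[float],
    end_logits: Sequence[float],
    start: int,
    end: int,
) -> float:
    """Probability of the start index times probability of the end index."""
    return softmax(start_logits)[start] * softmax(end_logits)[end]


# Toy logits (made up for illustration, not model output).
start_logits = [0.0, 0.0, math.log(2.0)]
end_logits = [0.0, math.log(2.0), 0.0]
score = span_score(start_logits, end_logits, start=2, end=2)
print(round(score, 4))
```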

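✨ Main features

This model can be used for question answering.

Misuse and out-of-scope use

The model should not be used to intentionally create hostile or alienating environments for people. In addition, the model was not trained to produce factual or true representations of people or events, so using it to generate such content is outside the scope of its abilities.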

🔧 Technical details

Model details

The DistilBERT model was proposed in the blog post Smaller, faster, cheaper, lighter: Introducing DistilBERT, a distilled version of BERT and in the paper DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. DistilBERT is a small, fast, cheap, and light Transformer model trained by distilling BERT base. It has 40% fewer parameters than bert-base-uncased and runs 60% faster while preserving over 95% of BERT's performance as measured on the GLUE language understanding benchmark.

This model is a fine-tuned checkpoint of DistilBERT-base-uncased, fine-tuned on SQuAD v1.1 using a second step of knowledge distillation.
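
Knowledge distillation trains the smaller student model to match the output distribution of a larger teacher. The sketch below shows only a soft-target term in plain Python: the KL divergence between temperature-softened teacher and student distributions. The temperature value, helper names, and toy logits are assumptions for illustration. The full training objective described by Sanh et al. (2019) combines a term of this kind with other losses and is not reproduced here.

```python
import math
from typing import List, Sequence


def softmax_t(logits: Sequence[float], temperature: float) -> List[float]:
    """Softmax of logits divided by a temperature (higher means softer)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [x / total for x in exps]


def distillation_loss(
    student_logits: Sequence[float],
    teacher_logits: Sequence[float],
    temperature: float = 2.0,
) -> float:
    """KL(teacher || student) on temperature-softened distributions."""
    p = softmax_t(teacher_logits, temperature)
    q = softmax_t(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy logits (made up): a student that matches the teacher has zero loss,
# and a student that disagrees has a positive loss.
teacher = [4.0, 1.0, 0.5]
print(distillation_loss([4.0, 1.0, 0.5], teacher))
print(distillation_loss([0.5, 1.0, 4.0], teacher) > 0)
```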

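Attribute	Details
Developed by	Hugging Face
Model type	Transformer-based language model
Language	English
License	Apache 2.0
Related model	DistilBERT-base-uncased
Resources for more information	See this repository for more about Distil* (a class of compressed models that includes this one); see Sanh et al. (2019) for more on knowledge distillation and the training procedure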

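Training

Training data

The distilbert-base-uncased model card describes its training data as follows:

DistilBERT was pretrained on the same data as BERT: BookCorpus, a dataset consisting of 11,038 unpublished books, and English Wikipedia (excluding lists, tables, and headers).

To learn more about the SQuAD v1.1 dataset, see the SQuAD v1.1 data card.

Training procedure

Preprocessing

See the distilbert-base-uncased model card for further details.

Pretraining

See the distilbert-base-uncased model card for further details.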

Evaluation

As discussed in the model repository:

This model reaches an F1 score of 86.9 on the SQuAD v1.1 dev set (for comparison, the BERT bert-base-uncased version reaches an F1 score of 88.5).
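
The F1 figure for SQuAD is a token-overlap score between the predicted and reference answer strings. The sketch below follows the general shape of that metric: lowercase, strip punctuation and English articles, then compute precision and recall over shared tokens. It is a simplified illustration, not the official evaluation script, and the helper names and example strings are made up.

```python
import re
import string
from collections import Counter
from typing import List


def normalize(text: str) -> List[str]:
    """Lowercase, drop punctuation and articles, and split on whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall between two answers."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


print(token_f1("the SQuAD dataset", "SQuAD dataset"))
print(token_f1("SQuAD", "SQuAD dataset"))
```

A dataset-level score averages this per-question value over all questions, taking the best match when a question has several reference answers, and is usually reported on a 0 to 100 scale.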

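Environmental impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019). The hardware type and hours used below are taken from the associated paper. These details cover only the training of DistilBERT and do not include the fine-tuning on SQuAD.

Hardware type: 8 16GB V100 GPUs
Hours used: 90 hours
Cloud provider: Unknown
Compute region: Unknown
Carbon emitted: Unknown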
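
The hardware figures above translate into total GPU time by simple multiplication. The sketch below does that arithmetic and shows how an emissions estimate could be formed once power draw and grid carbon intensity are known. Because the provider, region, and emissions are listed as unknown, the per-GPU power and carbon-intensity values in the example call are placeholders chosen purely for illustration, not measurements, and the function names are hypothetical.

```python
def gpu_hours(num_gpus: int, hours: float) -> float:
    """Total GPU time: number of devices times wall-clock hours."""
    return num_gpus * hours


def estimate_co2_kg(
    total_gpu_hours: float,
    watts_per_gpu: float,
    kg_co2_per_kwh: float,
) -> float:
    """Energy in kWh times grid carbon intensity in kg CO2eq per kWh."""
    energy_kwh = total_gpu_hours * watts_per_gpu / 1000.0
    return energy_kwh * kg_co2_per_kwh


# 8 GPUs for 90 hours, as listed above.
total = gpu_hours(num_gpus=8, hours=90)
print(total)

# PLACEHOLDER inputs (assumptions, not reported values): 250 W per GPU
# and 0.5 kg CO2eq per kWh. Replace both with real figures before use.
placeholder_estimate = estimate_co2_kg(total, watts_per_gpu=250.0, kg_co2_per_kwh=0.5)
print(placeholder_estimate)
```

The second printed number only demonstrates the formula. It says nothing about the actual emissions of the DistilBERT training run, which remain unknown.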

Technical specifications

See the associated paper for details on the modeling architecture, objective, compute infrastructure, and training.

📄 License

This model is released under the Apache 2.0 license.

📚 Citation

@inproceedings{sanh2019distilbert,
  title={DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter},
  author={Sanh, Victor and Debut, Lysandre and Chaumond, Julien and Wolf, Thomas},
  booktitle={NeurIPS EMC^2 Workshop},
  year={2019}
}

APA:

Sanh, V., Debut, L., Chaumond, J., & Wolf, T. (2019). DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108.