DistilBERT open-source question answering model: fewer parameters, faster inference, free to deploy, and accurate answers


Distilbert Base Cased Distilled Squad

Developed by distilbert

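DistilBERT is a lightweight, distilled version of BERT with 40% fewer parameters that runs 60% faster while retaining over 95% of BERT's performance. This model is the question-answering variant, fine-tuned on the SQuAD v1.1 dataset.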

Question answering system, English, open-source license: Apache-2.0 #QuestionAnsweringSystem #KnowledgeDistillation #EfficientInference

Downloads: 220.76k

Release date: 3/2/2022

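Model overview

A lightweight, Transformer-based English question answering model suited to extractive question answering, where the answer is extracted from a given text.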

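Model features

Efficient and lightweight

Through knowledge distillation, the model is 40% smaller than the original BERT and runs inference 60% faster.

High performance

Reaches an F1 score of 87.1 on the SQuAD v1.1 validation set, close to the 88.7 achieved by the original BERT.

Specialized for question answering

Optimized for extractive question answering, so it can be used directly when building question answering systems.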

Model capabilities

Text understanding

Answer extraction

Context analysis

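Use cases

Educational technology

Automated answering systems

Automatically extracts answers to questions from textbooks and reference materials.

Achieved an F1 score of 87.1 on the SQuAD benchmark.

Customer service

Automated FAQ responses

Quickly finds answers to questions in knowledge-base documents.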

🚀 DistilBERT base cased distilled SQuAD

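This is a DistilBERT-based model specialized for question answering. It was fine-tuned on the SQuAD v1.1 dataset using knowledge distillation, which makes it both fast and accurate at extractive question answering.

🚀 Quick start

Use the code below to get started with the model.

Basic usage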

>>> from transformers import pipeline
>>> question_answerer = pipeline("question-answering", model='distilbert-base-cased-distilled-squad')

>>> context = r"""
... Extractive Question Answering is the task of extracting an answer from a text given a question. An example of a
... question answering dataset is the SQuAD dataset, which is entirely based on that task. If you would like to fine-tune
... a model on a SQuAD task, you may leverage the examples/pytorch/question-answering/run_squad.py script.
... """

>>> result = question_answerer(question="What is a good example of a question answering dataset?", context=context)
>>> print(
... f"Answer: '{result['answer']}', score: {round(result['score'], 4)}, start: {result['start']}, end: {result['end']}"
...)

Answer: 'SQuAD dataset', score: 0.5152, start: 147, end: 160
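
The `start` and `end` values in the result are character offsets into `context`, with `end` exclusive, so slicing the context with them should reproduce the answer text. The following is a minimal check of that relationship. It does not call the model: the `result` dictionary is copied by hand from the output printed above, and the context is written with single spaces between words.

```python
# Context from the example above, with single spaces between words.
# The raw string starts with a newline, which counts as character 0.
context = r"""
Extractive Question Answering is the task of extracting an answer from a text given a question. An example of a
question answering dataset is the SQuAD dataset, which is entirely based on that task. If you would like to fine-tune
a model on a SQuAD task, you may leverage the examples/pytorch/question-answering/run_squad.py script.
"""

# Hand-copied from the printed output; not produced by running the model here.
result = {"answer": "SQuAD dataset", "score": 0.5152, "start": 147, "end": 160}

# Slicing the context with the offsets recovers the answer span.
span = context[result["start"]:result["end"]]
print(repr(span))
```

This is useful when the answer needs to be highlighted in the source text rather than only displayed on its own.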

Advanced usage

Here is how to use this model in PyTorch:

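from transformers import DistilBertTokenizer, DistilBertForQuestionAnswering
import torch

tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-cased-distilled-squad')
# Load the checkpoint with its question-answering head; DistilBertModel would return hidden states only
model = DistilBertForQuestionAnswering.from_pretrained('distilbert-base-cased-distilled-squad')

question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"

inputs = tokenizer(question, text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Most likely start and end token positions of the answer span
answer_start_index = int(outputs.start_logits.argmax())
answer_end_index = int(outputs.end_logits.argmax())

predict_answer_tokens = inputs.input_ids[0, answer_start_index : answer_end_index + 1]
print(tokenizer.decode(predict_answer_tokens))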

Here is how to use it in TensorFlow:

from transformers import DistilBertTokenizer, TFDistilBertForQuestionAnswering
import tensorflow as tf

tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-cased-distilled-squad")
model = TFDistilBertForQuestionAnswering.from_pretrained("distilbert-base-cased-distilled-squad")

question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"

inputs = tokenizer(question, text, return_tensors="tf")
outputs = model(**inputs)

answer_start_index = int(tf.math.argmax(outputs.start_logits, axis=-1)[0])
answer_end_index = int(tf.math.argmax(outputs.end_logits, axis=-1)[0])

predict_answer_tokens = inputs.input_ids[0, answer_start_index : answer_end_index + 1]
print(tokenizer.decode(predict_answer_tokens))
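
Taking the argmax of the start and end logits independently, as above, can in principle yield an end index that comes before the start index. A common remedy is to score every valid (start, end) pair and keep the best one. The sketch below illustrates the idea with plain Python lists. The function name `best_span`, its `max_answer_len` parameter, and the toy logits are made up for this example and are not part of the transformers API.

```python
from typing import List, Tuple

def best_span(start_logits: List[float], end_logits: List[float],
              max_answer_len: int = 15) -> Tuple[int, int]:
    """Return the (start, end) pair with the highest summed logit,
    subject to start <= end and a maximum span length (inclusive indices)."""
    best_score = float("-inf")
    best = (0, 0)
    for start, s_logit in enumerate(start_logits):
        # Only consider ends at or after the start, within the length limit.
        last = min(start + max_answer_len, len(end_logits))
        for end in range(start, last):
            score = s_logit + end_logits[end]
            if score > best_score:
                best_score = score
                best = (start, end)
    return best

# Toy logits: independent argmax would give start=3, end=1 (an invalid span).
start_logits = [0.1, 0.2, 0.3, 2.0, 0.0]
end_logits = [0.0, 2.5, 0.1, 1.0, 0.4]
print(best_span(start_logits, end_logits))  # (3, 3)
```

The double loop is quadratic in sequence length, which is acceptable for a sketch; production code usually restricts the search to the top few start and end candidates.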

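✨ Key features

Fast question answering: as a lightweight DistilBERT-based model, it answers questions quickly.
High accuracy: fine-tuned on the SQuAD v1.1 dataset, it reaches an F1 score of 87.1 on the dev set.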

📦 Installation

The original README does not include installation instructions, so this section is omitted.

📚 Documentation

Model details

The DistilBERT model was proposed in the blog post Smaller, faster, cheaper, lighter: Introducing DistilBERT, a distilled version of BERT, and in the paper DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. DistilBERT is a small, fast, cheap and light Transformer model trained by distilling BERT base. It has 40% fewer parameters than bert-base-uncased and runs 60% faster while preserving over 95% of BERT's performance as measured on the GLUE language understanding benchmark.

This model is a fine-tuned checkpoint of DistilBERT-base-cased, fine-tuned using (a second step of) knowledge distillation on SQuAD v1.1.
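
Knowledge distillation trains the smaller student model to match the output distribution of a larger teacher. As a rough illustration only, the sketch below computes a temperature-softened cross-entropy between teacher and student logits in plain Python. The temperature value, the toy logits, and the function names are assumptions made for this example; the actual training objective and hyperparameters are described in Sanh et al. (2019).

```python
import math
from typing import List

def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Softmax over logits divided by a temperature (higher = softer)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits: List[float], teacher_logits: List[float],
                      temperature: float = 2.0) -> float:
    """Cross-entropy of the student's softened distribution measured
    against the teacher's softened distribution (the soft targets)."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))

# Toy logits over three classes (assumed values, for illustration only).
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]
far_student = [0.5, 1.0, 4.0]

print(distillation_loss(close_student, teacher))  # smaller loss
print(distillation_loss(far_student, teacher))    # larger loss
```

A student whose logits resemble the teacher's gets a lower loss than one that ranks the classes differently, which is the signal the student is trained on.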

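Attribute	Details
Developed by	Hugging Face
Model type	Transformer-based language model
Language	English
License	Apache 2.0
Related model	DistilBERT-base-cased
Resources for more information	- See this repository for more about Distil* (a class of compressed models that includes this model) - See Sanh et al. (2019) for more on knowledge distillation and the training procedure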

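Uses

This model can be used for question answering.

Misuse and out-of-scope use

The model should not be used to intentionally create hostile or alienating environments for people. In addition, the model was not trained to be a factual or true representation of people or events, so using it to generate such content is out of scope for its abilities.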

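Risks, limitations and biases

⚠️ Important

Readers should be aware that language generated by this model may be disturbing or offensive to some and may propagate historical and current stereotypes.

Significant research has explored bias and fairness issues with language models (see, e.g., Sheng et al. (2021) and Bender et al. (2021)). Predictions generated by the model can include disturbing and harmful stereotypes across protected classes, identity characteristics, and sensitive, social, and occupational groups. For example: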

>>> from transformers import pipeline
>>> question_answerer = pipeline("question-answering", model='distilbert-base-cased-distilled-squad')

>>> context = r"""
... Alice is sitting on the bench. Bob is sitting next to her.
... """

>>> result = question_answerer(question="Who is the CEO?", context=context)
>>> print(
... f"Answer: '{result['answer']}', score: {round(result['score'], 4)}, start: {result['start']}, end: {result['end']}"
...)

Answer: 'Bob', score: 0.7527, start: 32, end: 35

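Users (both direct and downstream) should be made aware of the model's risks, biases and limitations.

Training

Training data

The distilbert-base-cased model was trained using the same data as the distilbert-base-uncased model. The distilbert-base-uncased model card describes its training data as follows:

DistilBERT was pretrained on the same data as BERT: BookCorpus, a dataset consisting of 11,038 unpublished books, and English Wikipedia (excluding lists, tables and headers).

To learn more about the SQuAD v1.1 dataset, see the SQuAD v1.1 data card.

Training procedure

Preprocessing

See the distilbert-base-cased model card for further details.

Pretraining

See the distilbert-base-cased model card for further details.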

Evaluation

As discussed in the model repository, this model reaches an F1 score of 87.1 on the SQuAD v1.1 dev set (for comparison, the bert-base-cased version of BERT reaches an F1 score of 88.7).
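
The F1 score reported for SQuAD is computed per question from the token overlap between the predicted and reference answers, then averaged over the dataset. The sketch below follows the general shape of that metric (lowercasing, stripping punctuation and English articles, then token-level precision and recall). It is a simplified illustration rather than the official evaluation script, and the function names are chosen for this example.

```python
import re
import string
from collections import Counter

_PUNCTUATION = set(string.punctuation)

def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in _PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between one predicted and one reference answer."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    common = Counter(pred_tokens) & Counter(ref_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(f1_score("the SQuAD dataset", "SQuAD dataset"))  # 1.0 after normalization
print(f1_score("SQuAD", "SQuAD dataset"))              # partial credit, about 0.667
```

Because partial overlaps earn partial credit, F1 is more forgiving than exact match when the predicted span is slightly too short or too long.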

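Environmental impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019). The hardware type and hours used below are taken from the associated paper. These details apply only to the training of DistilBERT and do not include the fine-tuning on SQuAD.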

Attribute	Details
Hardware type	8 16GB V100 GPUs
Hours used	90 hours
Cloud provider	Unknown
Compute region	Unknown
Carbon emitted	Unknown
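
If the 90 hours is read as wall-clock time on all 8 GPUs, the total compute is 8 × 90 = 720 GPU-hours. Turning that into an emissions estimate requires a power draw and a grid carbon intensity, neither of which is given here. The sketch below shows the arithmetic only. The 0.3 kW per-GPU power and the 0.4 kg CO2eq/kWh intensity are placeholder assumptions, not values from the paper or this model card, so the printed emissions figure is purely illustrative.

```python
# Figures from the table above.
NUM_GPUS = 8
HOURS = 90

# Placeholder assumptions (NOT from the source): adjust to your hardware and region.
ASSUMED_GPU_POWER_KW = 0.3      # assumed average draw per GPU, in kW
ASSUMED_KG_CO2_PER_KWH = 0.4    # assumed grid carbon intensity

def estimate_emissions(num_gpus, hours, gpu_power_kw, kg_co2_per_kwh):
    """Return (gpu_hours, energy_kwh, emissions_kg) for a training run."""
    gpu_hours = num_gpus * hours
    energy_kwh = gpu_hours * gpu_power_kw
    emissions_kg = energy_kwh * kg_co2_per_kwh
    return gpu_hours, energy_kwh, emissions_kg

gpu_hours, energy_kwh, emissions_kg = estimate_emissions(
    NUM_GPUS, HOURS, ASSUMED_GPU_POWER_KW, ASSUMED_KG_CO2_PER_KWH
)
print(f"{gpu_hours} GPU-hours, ~{energy_kwh:.0f} kWh, ~{emissions_kg:.0f} kg CO2eq (illustrative)")
```

Since the cloud provider and compute region are unknown, any real estimate would need those inputs before the result could be treated as meaningful.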

Technical details

See the associated paper for details on the modeling architecture, objective, compute infrastructure, and training.

Citation

@inproceedings{sanh2019distilbert,
  title={DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter},
  author={Sanh, Victor and Debut, Lysandre and Chaumond, Julien and Wolf, Thomas},
  booktitle={NeurIPS EMC^2 Workshop},
  year={2019}
}

APA:

Sanh, V., Debut, L., Chaumond, J., & Wolf, T. (2019). DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108.