Nominalization Candidate Classifier: an open-source model for identifying nominalizations that carry eventive meaning


Nominalization Candidate Classifier

Developed by kleinay

This model identifies nominalizations that carry eventive meaning. It is based on the BERT architecture and fine-tuned on the QANom dataset.

Sequence labeling

Transformers

English #predicative nominalization identification #eventive meaning classification #BERT fine-tuning

Downloads: 52

Release date: 3/2/2022

Model overview

A binary classifier that detects nominalizations with verbal meaning in text. It can identify nouns that carry an implicit action meaning, such as "construction".

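Model features

Predicative nominalization identification

Specifically identifies nominalizations that carry an eventive or action meaning in context.

Automatic candidate extraction

Automatically selects candidate nominalizations by combining POS annotation with lexical resources.

Adjustable probability threshold

Supports adjusting the classification threshold to match the precision/recall requirements of different application scenarios.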
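
As a minimal sketch of what threshold adjustment means in practice, the snippet below applies a cutoff to per-candidate probabilities. The probabilities are the ones from this card's own example output, rounded to four places. The helper name `select_predicates`, its default of 0.5, and the `>=` comparison are assumptions made for illustration, not the qanom package's internals.

```python
# Probabilities taken (rounded) from the example output on this card.
candidates = [
    {"predicate": "construction", "probability": 0.7627},
    {"predicate": "officer", "probability": 0.1983},
    {"predicate": "building", "probability": 0.5794},
    {"predicate": "beginning", "probability": 0.8938},
    {"predicate": "destruction", "probability": 0.8501},
    {"predicate": "construction", "probability": 0.7022},
]


def select_predicates(cands, threshold=0.5):
    """Hypothetical helper: keep candidates whose probability meets the threshold."""
    # Assumption: a candidate is accepted when probability >= threshold.
    return [c["predicate"] for c in cands if c["probability"] >= threshold]


# Raising the threshold keeps fewer, more confident candidates.
for t in (0.5, 0.75, 0.9):
    print(t, select_predicates(candidates, threshold=t))
```

A higher threshold favors precision because only confident candidates survive. A lower one favors recall because borderline cases such as "building" are kept.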

Model capabilities

Nominalization detection

Semantic role labeling support

Text semantic analysis

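Use cases

Natural language processing

Preprocessing for semantic role labeling

Identifies eventive nouns in text to serve as input to an SRL system.

Helps improve semantic role labeling for nominal predicates.

Enhanced information extraction

Extracts event information that carries implicit actions from news text.

Captures more of the key events that are expressed in noun form.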

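🚀 Nominalization Detector

This model identifies "predicative nominalizations", that is, nominalizations that carry an eventive (or "verbal") meaning in context. It is based on the bert-base-cased pretrained model and fine-tuned for token classification on the "nominalization detection" task as defined and annotated by the QANom project (Klein et al., COLING 2020).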

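🚀 Quick Start

The model is trained as a binary classifier over candidate nominalizations. Candidates are extracted with a part-of-speech tagger (filtering for common nouns) and lexical resources (such as WordNet and CatVar), keeping nouns that have at least one derivationally related verb. In the QANom annotation project, these candidates were given to annotators, who decided whether each one carries a "verbal" meaning in the context of the sentence. The current model reproduces this binary classification.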
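
The two-step candidate filter described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the qanom implementation: `TOY_DERIVATIONS` is a tiny hand-written stand-in for a lexical resource such as WordNet or CatVar, `extract_candidates` is a hypothetical helper, and the input is already POS-tagged with Penn Treebank tags.

```python
# Toy stand-in for a lexical resource such as WordNet or CatVar: maps a noun
# to verbs it is derivationally related to (hand-written, illustrative only).
TOY_DERIVATIONS = {
    "construction": ["construct"],
    "building": ["build"],
    "beginning": ["begin"],
    "destruction": ["destruct"],
    "officer": ["officer"],
}

# Penn Treebank tags for common nouns (singular and plural).
COMMON_NOUN_TAGS = {"NN", "NNS"}


def extract_candidates(tagged_tokens, derivations=TOY_DERIVATIONS):
    """Return (index, noun, verb_forms) for common nouns with a related verb."""
    found = []
    for idx, (token, tag) in enumerate(tagged_tokens):
        if tag not in COMMON_NOUN_TAGS:
            continue  # step 1: the POS filter keeps common nouns only
        verbs = derivations.get(token.lower(), [])
        if verbs:  # step 2: keep nouns with at least one related verb
            found.append((idx, token, verbs))
    return found


# Pre-tagged input; in practice a POS tagger would produce these tags.
tagged = [("The", "DT"), ("construction", "NN"), ("of", "IN"),
          ("the", "DT"), ("bridge", "NN"), ("finished", "VBD"), (".", ".")]
print(extract_candidates(tagged))
```

Here "bridge" passes the POS filter but is dropped because the toy lexicon lists no related verb for it. Candidates that survive both steps are what the classifier then scores in context.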

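✨ Key Features

Based on the bert-base-cased pretrained model and fine-tuned for the nominalization detection task.
Identifies nominalizations that carry an eventive meaning within a sentence.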

📦 Installation

The candidate extraction algorithm is implemented in the qanom package, which can be installed with the following command.

pip install qanom

💻 Usage Examples

Basic usage

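from qanom.nominalization_detector import NominalizationDetector

# Create the detector.
detector = NominalizationDetector()

# Input sentences are pre-tokenized, with tokens separated by spaces.
raw_sentences = ["The construction of the officer 's building finished right after the beginning of the destruction of the previous construction ."]

# Return every candidate together with its prediction and probability.
print(detector(raw_sentences, return_all_candidates=True))
# Return only candidates that pass the 0.75 threshold, without probabilities.
print(detector(raw_sentences, threshold=0.75, return_probability=False))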

Example output

[[{'predicate_idx': 1,
   'predicate': 'construction',
   'predicate_detector_prediction': True,
   'predicate_detector_probability': 0.7626778483390808,
   'verb_form': 'construct'},
  {'predicate_idx': 4,
   'predicate': 'officer',
   'predicate_detector_prediction': False,
   'predicate_detector_probability': 0.19832570850849152,
   'verb_form': 'officer'},
  {'predicate_idx': 6,
   'predicate': 'building',
   'predicate_detector_prediction': True,
   'predicate_detector_probability': 0.5794129371643066,
   'verb_form': 'build'},
  {'predicate_idx': 11,
   'predicate': 'beginning',
   'predicate_detector_prediction': True,
   'predicate_detector_probability': 0.8937646150588989,
   'verb_form': 'begin'},
  {'predicate_idx': 14,
   'predicate': 'destruction',
   'predicate_detector_prediction': True,
   'predicate_detector_probability': 0.8501205444335938,
   'verb_form': 'destruct'},
  {'predicate_idx': 18,
   'predicate': 'construction',
   'predicate_detector_prediction': True,
   'predicate_detector_probability': 0.7022264003753662,
   'verb_form': 'construct'}]]

[[{'predicate_idx': 1, 'predicate': 'construction', 'verb_form': 'construct'},
  {'predicate_idx': 11, 'predicate': 'beginning', 'verb_form': 'begin'},
  {'predicate_idx': 14, 'predicate': 'destruction', 'verb_form': 'destruct'}]]
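
Judging from the example above, `predicate_idx` looks like a zero-based position in the whitespace-split sentence. The sketch below checks that reading against the thresholded output shown on this card and marks each predicate in its sentence. The helper `mark_predicate` and the bracket markers are hypothetical and only serve the illustration.

```python
sentence = ("The construction of the officer 's building finished right after "
            "the beginning of the destruction of the previous construction .")

# Thresholded output copied from the example above (one list per input sentence).
predictions = [[
    {"predicate_idx": 1, "predicate": "construction", "verb_form": "construct"},
    {"predicate_idx": 11, "predicate": "beginning", "verb_form": "begin"},
    {"predicate_idx": 14, "predicate": "destruction", "verb_form": "destruct"},
]]


def mark_predicate(tokens, idx):
    """Hypothetical helper: wrap the token at position idx in square brackets."""
    return " ".join(f"[{tok}]" if i == idx else tok for i, tok in enumerate(tokens))


tokens = sentence.split()
for pred in predictions[0]:
    # Assumption: predicate_idx is a zero-based index into the whitespace tokens.
    assert tokens[pred["predicate_idx"]] == pred["predicate"]
    print(pred["verb_form"], "->", mark_predicate(tokens, pred["predicate_idx"]))
```

Marking the predicate this way is one simple option for handing a detected nominalization, together with its verb form, to a downstream component such as an SRL system.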

📚 Documentation

For full documentation of the candidate extraction algorithm, see the README in the QANom GitHub repository.
A demo is also available for trying out the model interactively.

📚 Citation

If you use this model, please cite the following paper.

@inproceedings{klein2020qanom,
  title={QANom: Question-Answer driven SRL for Nominalizations},
  author={Klein, Ayal and Mamou, Jonathan and Pyatkin, Valentina and Stepanov, Daniela and He, Hangfeng and Roth, Dan and Zettlemoyer, Luke and Dagan, Ido},
  booktitle={Proceedings of the 28th International Conference on Computational Linguistics},
  pages={3069--3083},
  year={2020}
}