BiBERT-ende: an open-source bilingual (English-German) pre-trained language model for improving neural machine translation

Bibert Ende

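Developed by jhu-clsp

BiBERT-ende is a bilingual (English-German) pre-trained language model tailored for neural machine translation (NMT). It improves translation performance by supplying contextualized embeddings to the translation model.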

Machine Translation

Transformers

Multilingual #Bilingual pre-training #Neural machine translation #Contextualized embeddings

Downloads: 40

Released: 3/2/2022

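Model Overview

BiBERT-ende is a tailored bilingual pre-trained language model. Its contextualized embeddings are fed directly into the NMT encoder as input, which simplifies the integration of pre-trained models into NMT systems and, according to the accompanying paper, achieves state-of-the-art translation performance.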

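Model Features

Bilingual pre-training

Pre-training specifically on English and German optimizes contextual understanding across the two languages.

Simplified integration

The contextualized embeddings are used directly as the input of the NMT encoder, which simplifies integrating a pre-trained model into the translation system.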

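Stochastic layer selection

The paper proposes a stochastic layer selection approach so that features from different layers of the contextualized embeddings are fully utilized.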
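
The stochastic layer selection idea can be illustrated with a minimal sketch. It assumes, as one reading of the paper and not the authors' code, that the output of one of the last K layers is drawn uniformly at random at each training step and that those layers are averaged at inference time. The function name `select_layer_output`, the parameter `last_k`, and the toy vectors are hypothetical.

```python
import random


def select_layer_output(hidden_states, last_k=8, training=True, rng=random):
    """Choose a contextualized embedding from the last `last_k` layers.

    hidden_states: list of per-layer outputs; each output is a list of
    token vectors (lists of floats), and all layers share the same shape.
    """
    if not hidden_states:
        raise ValueError("hidden_states must not be empty")
    if last_k < 1:
        raise ValueError("last_k must be at least 1")
    candidates = hidden_states[-last_k:]
    if training:
        # Assumed training behaviour: pick one candidate layer uniformly at random.
        return rng.choice(candidates)
    # Assumed inference behaviour: element-wise average over the candidate layers.
    n = len(candidates)
    return [
        [sum(values) / n for values in zip(*token_vectors)]
        for token_vectors in zip(*candidates)
    ]


# Toy input: 2 layers, 2 tokens, 2 dimensions.
layers = [
    [[0.0, 0.0], [2.0, 2.0]],
    [[2.0, 4.0], [4.0, 6.0]],
]
averaged = select_layer_output(layers, last_k=2, training=False)
# averaged == [[1.0, 2.0], [3.0, 4.0]]
```

In a real system the per-layer outputs would be tensors taken from the pre-trained model, and the selected output would be passed on as the NMT encoder input.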

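Dual-directional translation model

A dual-directional translation model covers both English→German and German→English, and the paper reports strong results in both directions.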
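
One way to picture the dual-directional setup is as a data-preparation step. The sketch below assumes that a single model is trained on both translation directions of the same parallel corpus; this page does not describe the actual training procedure, so this is an illustration and not the authors' pipeline. The function name `build_dual_directional_pairs` and the sample sentences are hypothetical.

```python
import random


def build_dual_directional_pairs(parallel_pairs, seed=0):
    """Return (source, target) training pairs covering both directions.

    parallel_pairs: iterable of (english, german) sentence pairs.
    """
    forward = [(en, de) for en, de in parallel_pairs]   # English -> German
    backward = [(de, en) for en, de in forward]         # German -> English
    mixed = forward + backward
    # Shuffle with a fixed seed so both directions are interleaved reproducibly.
    random.Random(seed).shuffle(mixed)
    return mixed


pairs = [("Hello", "Hallo"), ("Thank you", "Danke")]
training_pairs = build_dual_directional_pairs(pairs)
# training_pairs holds 4 examples: each sentence pair once per direction.
```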

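Model Capabilities

English-to-German machine translation (as the embedding source for an NMT encoder)

German-to-English machine translation (as the embedding source for an NMT encoder)

Contextualized embedding generation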

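Use Cases

Machine translation

IWSLT'14 translation

On the IWSLT'14 dataset, the best models in the paper achieve BLEU scores of 30.45 for English→German and 38.61 for German→English without back translation.

The paper reports that these scores exceed all previously published results.

WMT'14 translation

On the WMT'14 dataset, the best models in the paper achieve BLEU scores of 31.26 for English→German and 34.94 for German→English without back translation.

The paper reports that these scores exceed all previously published results.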

🚀 bibert-ende

bibert-ende is a bilingual English-German language model. For details, see the EMNLP 2021 paper.

🚀 Quick Start

The model is described in the EMNLP 2021 paper "BERT, mBERT, or BiBERT? A Study on Contextualized Embeddings for Neural Machine Translation":

@inproceedings{xu-etal-2021-bert,
    title = "{BERT}, m{BERT}, or {B}i{BERT}? A Study on Contextualized Embeddings for Neural Machine Translation",
    author = "Xu, Haoran  and
      Van Durme, Benjamin  and
      Murray, Kenton",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2021",
    address = "Online and Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.emnlp-main.534",
    pages = "6663--6675",
    abstract = "The success of bidirectional encoders using masked language models, such as BERT, on numerous natural language processing tasks has prompted researchers to attempt to incorporate these pre-trained models into neural machine translation (NMT) systems. However, proposed methods for incorporating pre-trained models are non-trivial and mainly focus on BERT, which lacks a comparison of the impact that other pre-trained models may have on translation performance. In this paper, we demonstrate that simply using the output (contextualized embeddings) of a tailored and suitable bilingual pre-trained language model (dubbed BiBERT) as the input of the NMT encoder achieves state-of-the-art translation performance. Moreover, we also propose a stochastic layer selection approach and a concept of a dual-directional translation model to ensure the sufficient utilization of contextualized embeddings. In the case of without using back translation, our best models achieve BLEU scores of 30.45 for En→De and 38.61 for De→En on the IWSLT{'}14 dataset, and 31.26 for En→De and 34.94 for De→En on the WMT{'}14 dataset, which exceeds all published numbers.",
}

📦 Installation

Note that the tokenizer class is BertTokenizer, not AutoTokenizer.

from transformers import BertTokenizer, AutoModel

# Load the tokenizer with BertTokenizer (not AutoTokenizer), then the encoder weights.
tokenizer = BertTokenizer.from_pretrained("jhu-clsp/bibert-ende")
model = AutoModel.from_pretrained("jhu-clsp/bibert-ende")
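
Once the tokenizer and model are loaded, contextualized embeddings can be read from the model's hidden states. The following is a minimal sketch and not the authors' extraction script: the helper names `load_bibert` and `contextual_embeddings` and the `layer` parameter are hypothetical, and running `load_bibert` assumes that `transformers` and PyTorch are installed and that the weights can be downloaded.

```python
def load_bibert(name="jhu-clsp/bibert-ende"):
    """Load the tokenizer and encoder; the weights are downloaded on first use."""
    # Imported lazily so the helper below can be defined without the library installed.
    from transformers import AutoModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained(name)  # BertTokenizer, not AutoTokenizer
    model = AutoModel.from_pretrained(name)
    model.eval()                 # switch off dropout
    model.requires_grad_(False)  # embeddings only, so no gradient tracking is needed
    return tokenizer, model


def contextual_embeddings(sentences, tokenizer, model, layer=-1):
    """Return (hidden states of `layer`, attention mask) for a batch of sentences."""
    inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    outputs = model(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        output_hidden_states=True,
    )
    # hidden_states[0] is the embedding-layer output; hidden_states[-1] is the last layer.
    return outputs.hidden_states[layer], inputs["attention_mask"]
```

Calling `tokenizer, model = load_bibert()` and then `contextual_embeddings(["Hello world!", "Hallo Welt!"], tokenizer, model)` should return a tensor of shape (batch size, sequence length, hidden size) together with the padding mask. Positions where the mask is 0 are padding and would normally be ignored downstream.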