Open-Source TransQuest Translation Quality Estimation Framework - Free to Deploy, for More Accurate Translation Evaluation

Monotransquest Da Any En

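Developed by TransQuest

TransQuest is an open-source framework for translation quality estimation. It was the winning solution in the WMT 2020 quality estimation shared task on sentence-level Direct Assessment.

Machine Translation

Transformers

License: Apache-2.0 #TranslationQualityEstimation #MultilingualSupport #SentenceLevelPrediction

Downloads: 29

Released: 3/2/2022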

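Model Overview

TransQuest provides sentence-level and word-level translation quality estimation. It supports both post-editing effort prediction and direct assessment, and it works across multiple language pairs.

Model Features

Strong quality estimation performance

Performs well on the WMT 2020 quality estimation task, outperforming existing frameworks such as OpenKiwi and DeepQuest

Multilingual support

Provides pretrained quality estimation models for 15 language pairs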

Multi-level estimation

Supports translation quality estimation at the sentence level and the word level

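Two estimation modes

Supports two kinds of quality estimation: post-editing effort prediction and direct assessment

Model Capabilities

Translation quality estimation

Post-editing effort prediction

Direct assessment of translation quality

Word-level quality estimation

Sentence-level quality estimation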

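Use Cases

Machine Translation

Translation engine selection

Selects the best translation when several translation engines are available

Makes the choice between candidate translations more accurate

Reliability assessment of translated content

Informs end users about the reliability of automatically translated content

Increases user trust in translation output

Publication decisions

Decides whether a translation can be published as is or needs human post-editing

Streamlines the translation workflow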
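The engine-selection use case can be illustrated with a minimal sketch. It assumes each candidate translation has already been given a quality score where higher means better; the engine names and score values are hypothetical and are not produced by any real model here.

```python
from typing import Dict, Tuple


def pick_best_translation(scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the (engine name, score) pair with the highest predicted quality."""
    if not scores:
        raise ValueError("scores must not be empty")
    best_engine = max(scores, key=scores.get)
    return best_engine, scores[best_engine]


# Hypothetical quality scores for three candidate translations of one source sentence.
candidate_scores = {"engine_a": 0.12, "engine_b": 0.47, "engine_c": -0.30}
print(pick_best_translation(candidate_scores))
```

In practice the scores would come from a quality estimation model run on each (source, candidate) pair; the selection step itself stays this simple.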

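🚀 TransQuest: Translation Quality Estimation with Cross-lingual Transformers

The goal of quality estimation (QE) is to evaluate the quality of a translation without access to a reference translation. High-accuracy QE that can be deployed easily for many language pairs is the missing piece in many commercial translation workflows, and it has numerous potential uses. It can select the best translation when several translation engines are available, or inform the end user about the reliability of automatically translated content. QE systems can also decide whether a translation can be published as is, whether it requires human post-editing before publication, or whether it should be retranslated from scratch by a human. Quality estimation can be performed at the document level, the sentence level, and the word level.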
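The publish, post-edit, or retranslate decision described above can be sketched as a simple threshold rule. The function name and both threshold values are hypothetical; suitable cut-offs depend on the score scale of the model in use and would need to be calibrated on real data.

```python
def route_translation(score: float,
                      publish_at: float = 0.5,
                      post_edit_at: float = -0.5) -> str:
    """Map a predicted quality score to a workflow decision.

    Both thresholds are illustrative assumptions, not values from TransQuest.
    """
    if publish_at < post_edit_at:
        raise ValueError("publish_at must be >= post_edit_at")
    if score >= publish_at:
        return "publish"
    if score >= post_edit_at:
        return "post-edit"
    return "retranslate"


# Hypothetical scores for three translated sentences.
for s in (0.8, 0.0, -1.2):
    print(s, route_translation(s))
```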

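With TransQuest, we have open-sourced our research in translation quality estimation, which was also the winning solution in the WMT 2020 shared task on sentence-level Direct Assessment. TransQuest outperforms current open-source quality estimation frameworks such as OpenKiwi and DeepQuest.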

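✨ Features

Sentence-level translation quality estimation: covers both aspects, predicting post-editing effort and direct assessment.
Word-level translation quality estimation: predicts the quality of source words, target words, and target gaps.
Strong performance: outperforms current state-of-the-art quality estimation methods such as DeepQuest and OpenKiwi in all the languages tested.
Pretrained models: pretrained quality estimation models for fifteen language pairs are available on HuggingFace.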
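To make the "target words and target gaps" idea concrete, here is a minimal sketch. It assumes the interleaved layout commonly used in WMT word-level QE data, where a target sentence of N tokens carries 2N+1 tags (gap, word, gap, ..., word, gap). Whether a particular TransQuest model returns exactly this layout is an assumption to verify against its own output; the helper name and the sample tags are hypothetical.

```python
from typing import List, Tuple


def split_target_tags(tags: List[str],
                      target_tokens: List[str]) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Split an interleaved gap/word tag sequence into word tags and gap tags.

    Assumes the order gap_0, word_1, gap_1, ..., word_N, gap_N.
    """
    expected = 2 * len(target_tokens) + 1
    if len(tags) != expected:
        raise ValueError(f"expected {expected} tags, got {len(tags)}")
    gap_tags = tags[0::2]                               # even positions are gaps
    word_tags = list(zip(target_tokens, tags[1::2]))    # odd positions are words
    return word_tags, gap_tags


# Hypothetical tags for a two-token translation.
words, gaps = split_target_tags(["OK", "BAD", "OK", "OK", "BAD"], ["hello", "world"])
print(words)
print(gaps)
```

A "BAD" gap tag marks a position where one or more words appear to be missing from the translation, which is why gaps are tagged separately from words.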

📦 Installation

Install with pip

pip install transquest

Install from source

git clone https://github.com/TharinduDR/TransQuest.git
cd TransQuest
pip install -r requirements.txt
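After either installation route, a quick check that the package is importable can save debugging time later. The sketch below uses only the Python standard library; the helper name `is_installed` is hypothetical.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if a top-level package can be found on the import path."""
    return importlib.util.find_spec(package) is not None


print("transquest importable:", is_installed("transquest"))
```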

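💻 Usage Example

Basic usage

import torch
from transquest.algo.sentence_level.monotransquest.run_model import MonoTransQuestModel

# Load the pretrained sentence-level direct assessment model; use the GPU when one is available.
model = MonoTransQuestModel("xlmroberta", "TransQuest/monotransquest-da-any_en", num_labels=1, use_cuda=torch.cuda.is_available())
# Each input item is a [source sentence, machine translation] pair.
predictions, raw_outputs = model.predict([["Reducerea acestor conflicte este importantă pentru conservare.", "Reducing these conflicts is not important for preservation."]])
# Print the predicted quality score for the pair.
print(predictions)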
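When scoring many sentences, it can help to wrap the pairing step in a small helper. The sketch below is not part of TransQuest: `score_pairs`, `to_float_list`, and `fake_predict` are hypothetical names, and `fake_predict` only stands in for a real `model.predict` so the example runs without downloading a model. It assumes the predict function returns a `(predictions, raw_outputs)` tuple as in the example above, and that the predictions may be either a sequence or a single scalar.

```python
from typing import Callable, List, Sequence


def to_float_list(predictions) -> List[float]:
    """Normalise predictions to a list of floats (accepts a sequence or a scalar)."""
    try:
        return [float(p) for p in predictions]
    except TypeError:
        # A bare scalar is not iterable; wrap it in a list.
        return [float(predictions)]


def score_pairs(sources: Sequence[str],
                translations: Sequence[str],
                predict_fn: Callable) -> List[float]:
    """Build [source, translation] pairs and score them with predict_fn."""
    if len(sources) != len(translations):
        raise ValueError("sources and translations must have the same length")
    pairs = [[src, mt] for src, mt in zip(sources, translations)]
    predictions, _raw_outputs = predict_fn(pairs)
    return to_float_list(predictions)


def fake_predict(pairs):
    """Stand-in for model.predict: scores each pair by translation length."""
    return [float(len(mt)) for _src, mt in pairs], None


print(score_pairs(["a", "b"], ["xx", "yyy"], fake_predict))
```

With a loaded model, `predict_fn=model.predict` would replace the stand-in.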

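📚 Documentation

For more details, see the following documentation:

Installation: how to install TransQuest locally with pip.
Architectures: the architectures implemented in TransQuest.
1. Sentence-level architectures: two architectures are released for sentence-level quality estimation, MonoTransQuest and SiameseTransQuest.
2. Word-level architecture: the MicroTransQuest architecture is released for word-level quality estimation.
Examples: several examples of using TransQuest in recent WMT quality estimation shared tasks.
1. Sentence-level examples
2. Word-level examples
Pretrained models: pretrained quality estimation models for fifteen language pairs, covering both the sentence level and the word level.
1. Sentence-level models
2. Word-level models
Contact: if you run into any problems while using TransQuest, please contact us.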

📄 License

This project is licensed under Apache-2.0.

📚 Citation

If you use the word-level architecture, please consider citing this paper, which was accepted at ACL 2021:

@InProceedings{ranasinghe2021,
author = {Ranasinghe, Tharindu and Orasan, Constantin and Mitkov, Ruslan},
title = {An Exploratory Analysis of Multilingual Word Level Quality Estimation with Cross-Lingual Transformers},
booktitle = {Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics},
year = {2021}
}

If you use the sentence-level architectures, please consider citing these papers, published at COLING 2020 and at WMT 2020 (held at EMNLP 2020):

@InProceedings{transquest:2020a,
author = {Ranasinghe, Tharindu and Orasan, Constantin and Mitkov, Ruslan},
title = {TransQuest: Translation Quality Estimation with Cross-lingual Transformers},
booktitle = {Proceedings of the 28th International Conference on Computational Linguistics},
year = {2020}
}

@InProceedings{transquest:2020b,
author = {Ranasinghe, Tharindu and Orasan, Constantin and Mitkov, Ruslan},
title = {TransQuest at WMT2020: Sentence-Level Direct Assessment},
booktitle = {Proceedings of the Fifth Conference on Machine Translation},
year = {2020}
}