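monotransquest-da-multilingual: an open-source translation quality estimation framework for evaluating translation quality across multiple language pairs, free of charge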

Monotransquest Da Multilingual

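Developed by TransQuest

TransQuest is an open-source translation quality estimation framework. It estimates translation quality without reference translations and works across many language pairs.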

Transformers

License: Apache-2.0 #multilingual-quality-estimation #reference-free-evaluation #translation-quality-prediction

Downloads: 183

Released: 3/2/2022

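Model overview

TransQuest provides sentence-level and word-level translation quality estimation. It can predict post-editing effort and direct assessment scores, and it supports 15 language pairs.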

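Model features

Multilingual support

Estimates translation quality for 15 language pairs, so it fits a range of language settings.

High performance

Performed strongly in the WMT 2020 sentence-level direct assessment task, outperforming existing frameworks such as OpenKiwi and DeepQuest.

Pre-trained models

Pre-trained models are available for multiple language pairs and can be used directly for quality estimation.

Multi-level estimation

Supports sentence-level and word-level translation quality estimation.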

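Model capabilities

Sentence-level translation quality estimation

Word-level translation quality estimation

Post-editing effort prediction

Direct assessment of translation quality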

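Use cases

Translation engine selection

Choosing the best translation engine

When several translation engines are available, use TransQuest to score each engine's output and pick the best translation.

This improves translation quality and reduces the need for post-editing.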
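The engine-selection use case above can be sketched in a few lines of Python. This is a minimal illustration, not part of TransQuest: `pick_best_translation`, `ScoreFn`, and `toy_score_fn` are hypothetical names, the scorer is assumed to return one number per sentence pair, and a higher score is assumed to mean a better translation.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical type: any callable mapping [source, translation] pairs to one score per pair.
ScoreFn = Callable[[List[List[str]]], Sequence[float]]


def pick_best_translation(
    source: str, candidates: Dict[str, str], score_fn: ScoreFn
) -> Tuple[str, str, float]:
    """Return (engine_name, translation, score) for the highest-scoring candidate."""
    if not candidates:
        raise ValueError("at least one candidate translation is required")
    engines = list(candidates)
    # Pair the same source sentence with every engine's output.
    pairs = [[source, candidates[name]] for name in engines]
    scores = [float(s) for s in score_fn(pairs)]
    # Assumption: a higher score means better estimated quality.
    best = max(range(len(engines)), key=scores.__getitem__)
    return engines[best], candidates[engines[best]], scores[best]


def toy_score_fn(pairs: List[List[str]]) -> List[float]:
    """Stand-in scorer for demonstration only; it is NOT a quality model."""
    return [-1.0 if " not " in f" {translation} " else 1.0 for _, translation in pairs]
```

With a loaded TransQuest sentence-level model, the scorer could plausibly be `lambda pairs: model.predict(pairs)[0]`, since `predict` returns the predictions as its first value; treat that wiring as an assumption and verify the shape of the output before relying on it.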

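Publishing translated content

Deciding whether a translation can be published as is

Use TransQuest to score translated content and decide whether it needs human post-editing or a fresh translation.

This helps ensure the quality of published content and lowers the risk of releasing mistranslations.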
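One way to turn a predicted score into the publish, post-edit, or retranslate decision described above is a pair of thresholds. The sketch below is only an illustration: `Action`, `triage`, and both default threshold values are hypothetical placeholders, and real thresholds would need to be calibrated against human judgments for each language pair and domain.

```python
from enum import Enum


class Action(Enum):
    PUBLISH = "publish"
    POST_EDIT = "post-edit"
    RETRANSLATE = "retranslate"


def triage(score: float, publish_at: float = 0.5, post_edit_at: float = -0.5) -> Action:
    """Map a predicted quality score to an action.

    Assumptions: a higher score means better quality, and both thresholds
    are placeholder values rather than recommendations.
    """
    if publish_at < post_edit_at:
        raise ValueError("publish_at must not be lower than post_edit_at")
    if score >= publish_at:
        return Action.PUBLISH
    if score >= post_edit_at:
        return Action.POST_EDIT
    return Action.RETRANSLATE
```

Keeping the thresholds as parameters makes it straightforward to tune them separately for each content type without touching the decision logic.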

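🚀 TransQuest: Translation Quality Estimation with Cross-lingual Transformers

TransQuest is a tool for translation quality estimation: it assesses the quality of a translation without needing a reference translation. High-accuracy quality estimation that is easy to deploy across many language pairs is a missing piece in many commercial translation workflows, and it has many potential uses. For example, it can select the best translation when several translation engines are available, and it can tell end users how reliable automatically translated content is. A quality estimation system can also decide whether a translation can be published as is in a given context, needs human post-editing, or has to be redone by a human translator. Quality estimation can be carried out at different levels: document level, sentence level, and word level.

We have open-sourced our research in translation quality estimation as TransQuest, which also won the sentence-level direct assessment quality estimation shared task at WMT 2020. TransQuest outperforms current open-source quality estimation frameworks such as OpenKiwi and DeepQuest.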

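🚀 Quick start

TransQuest estimates translation quality without reference translations. It performs well in experiments across multiple language pairs, and it ships pre-trained models so you can get started quickly.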

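✨ Key features

Sentence-level translation quality estimation: predicts post-editing effort and direct assessment scores.
Word-level translation quality estimation: predicts the quality of source words, target words, and target gaps.
Strong performance: outperforms current state-of-the-art quality estimation methods such as DeepQuest and OpenKiwi in all the languages experimented with.
Pre-trained models: pre-trained quality estimation models for fifteen language pairs are available on HuggingFace.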
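To make "target words and target gaps" concrete: in the WMT word-level quality estimation shared tasks, a target sentence with n words is labeled with 2n + 1 tags, interleaved as gap, word, gap, word, ..., gap. The helper below splits such a sequence. It is a sketch under that assumption only; `split_target_tags` is a hypothetical name, and whether TransQuest's word-level output uses exactly this layout should be checked against its documentation.

```python
from typing import List, Tuple


def split_target_tags(target: str, tags: List[str]) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Split interleaved gap/word tags into (word, tag) pairs and gap tags.

    Assumed layout (WMT word-level convention): gap, word, gap, word, ..., gap.
    """
    words = target.split()  # whitespace tokenization, for illustration only
    if len(tags) != 2 * len(words) + 1:
        raise ValueError(f"expected {2 * len(words) + 1} tags, got {len(tags)}")
    gap_tags = tags[0::2]   # even positions: gaps before, between, and after words
    word_tags = tags[1::2]  # odd positions: the words themselves
    return list(zip(words, word_tags)), gap_tags
```

A "BAD" tag on a gap conventionally signals that one or more words are missing at that position, while a "BAD" tag on a word marks the word itself as incorrect.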

📦 Installation

Install with pip

pip install transquest

Install from source

git clone https://github.com/TharinduDR/TransQuest.git
cd TransQuest
pip install -r requirements.txt

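💻 Usage examples

Basic usage

import torch
from transquest.algo.sentence_level.monotransquest.run_model import MonoTransQuestModel

# Load the pre-trained multilingual direct assessment model; use the GPU when one is available.
model = MonoTransQuestModel("xlmroberta", "TransQuest/monotransquest-da-multilingual", num_labels=1, use_cuda=torch.cuda.is_available())
# Each input is a [source sentence, translation] pair; here the translation wrongly negates the Romanian source.
predictions, raw_outputs = model.predict([["Reducerea acestor conflicte este importantă pentru conservare.", "Reducing these conflicts is not important for preservation."]])
# Print the predicted quality score for the pair.
print(predictions)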
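When scoring many sentences, the input to `model.predict` is a list of `[source, translation]` pairs, as in the example above. The helpers below sketch how parallel lists could be paired up and how the result could be normalized to a plain list of floats. Both `build_pairs` and `as_score_list` are hypothetical names, and the idea that a single-pair call may return a bare scalar instead of a sequence is an assumption to verify against your installed version.

```python
from typing import Any, List, Sequence


def build_pairs(sources: Sequence[str], translations: Sequence[str]) -> List[List[str]]:
    """Zip parallel source/translation lists into [source, translation] pairs."""
    if len(sources) != len(translations):
        raise ValueError("sources and translations must have the same length")
    return [[src, tgt] for src, tgt in zip(sources, translations)]


def as_score_list(predictions: Any) -> List[float]:
    """Return predictions as a list of floats, whether given a scalar or a sequence."""
    try:
        return [float(p) for p in predictions]
    except TypeError:
        # Not iterable: treat the value as a single score (assumed single-pair case).
        return [float(predictions)]
```

A typical call would then look like `scores = as_score_list(model.predict(build_pairs(sources, translations))[0])`, keeping each score aligned with its input pair by position.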

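📚 Documentation

For more details, see the following documentation:

Installation - Install TransQuest locally using pip.
Architectures - The architectures implemented in TransQuest:
1. Sentence-level architectures - Two architectures, MonoTransQuest and SiameseTransQuest, are released for sentence-level quality estimation.
2. Word-level architecture - MicroTransQuest is released for word-level quality estimation.
Examples - Several examples show how to use TransQuest in recent WMT quality estimation shared tasks:
1. Sentence-level examples
2. Word-level examples
Pre-trained models - Pre-trained quality estimation models are provided for fifteen language pairs, covering both sentence level and word level:
1. Sentence-level models
2. Word-level models
Contact - Get in touch with us for any questions about TransQuest.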

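📄 License

This project is licensed under the Apache-2.0 license.

📚 Citation

If you use the word-level architecture, please consider citing this paper, which was accepted at ACL 2021: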

@InProceedings{ranasinghe2021,
author = {Ranasinghe, Tharindu and Orasan, Constantin and Mitkov, Ruslan},
title = {An Exploratory Analysis of Multilingual Word Level Quality Estimation with Cross-Lingual Transformers},
booktitle = {Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics},
year = {2021}
}

If you use the sentence-level architectures, please consider citing these papers, published at COLING 2020 and at WMT 2020 (held at EMNLP 2020):

@InProceedings{transquest:2020a,
author = {Ranasinghe, Tharindu and Orasan, Constantin and Mitkov, Ruslan},
title = {TransQuest: Translation Quality Estimation with Cross-lingual Transformers},
booktitle = {Proceedings of the 28th International Conference on Computational Linguistics},
year = {2020}
}

@InProceedings{transquest:2020b,
author = {Ranasinghe, Tharindu and Orasan, Constantin and Mitkov, Ruslan},
title = {TransQuest at WMT2020: Sentence-Level Direct Assessment},
booktitle = {Proceedings of the Fifth Conference on Machine Translation},
year = {2020}
}