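monotransquest-da-multilingual: an open-source translation quality estimation framework for evaluating translation quality across multiple language pairs, free of charge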

Monotransquest Da Multilingual

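Developed by TransQuest

TransQuest is an open-source translation quality estimation framework. It estimates translation quality without reference translations and works across multiple language pairs.

Question Answering

Transformers

License: Apache-2.0 #MultilingualQualityEstimation #ReferenceFreeEvaluation #TranslationQualityPrediction

Downloads: 183

Released: 3/2/2022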

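Model Overview

TransQuest provides sentence-level and word-level translation quality estimation. It can predict post-editing effort and direct assessment scores, and it supports 15 language pairs.

Model Features

Multilingual support

Supports translation quality estimation for 15 language pairs, so it can be used across a range of language settings.

High performance

Won the WMT 2020 sentence-level direct assessment shared task and outperforms existing open-source frameworks such as OpenKiwi and DeepQuest.

Pretrained models

Provides pretrained models for multiple language pairs that can be used directly for quality estimation.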

Multi-level estimation

Supports sentence-level and word-level translation quality estimation.

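Model Capabilities

Sentence-level translation quality estimation

Word-level translation quality estimation

Post-editing effort prediction

Direct assessment of translation quality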

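Use Cases

Translation engine selection

Choosing the best translation engine

When several translation engines are available, use TransQuest to score the output of each engine and keep the highest-scoring translation.

This improves translation quality and reduces the need for post-editing.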
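The engine-selection workflow above can be sketched as follows. This is a minimal illustration, not part of TransQuest: `pick_best_translation` and the `ScoreFn` type are hypothetical names, and the sketch assumes a scorer that accepts `[source, translation]` pairs and returns one score per pair, with higher scores meaning better quality.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical scorer signature (an assumption for this sketch): takes a list
# of [source, translation] pairs and returns one quality score per pair.
ScoreFn = Callable[[List[List[str]]], Sequence[float]]


def pick_best_translation(
    source: str, candidates: Sequence[str], score_fn: ScoreFn
) -> Tuple[str, float]:
    """Return the candidate with the highest estimated quality and its score."""
    if not candidates:
        raise ValueError("at least one candidate translation is required")
    # Pair the same source sentence with every candidate translation.
    pairs = [[source, candidate] for candidate in candidates]
    scores = [float(s) for s in score_fn(pairs)]
    if len(scores) != len(candidates):
        raise ValueError("the scorer must return one score per candidate")
    # Index of the highest score; ties resolve to the earliest candidate.
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return candidates[best_index], scores[best_index]
```

In practice `score_fn` could be a thin wrapper around the `predict` method of a loaded sentence-level model; any callable with the same shape works, which keeps the selection logic independent of the model.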

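Publishing translated content

Deciding whether a translation can be published as is

Use TransQuest to score translated content and decide whether it needs human post-editing or retranslation.

This safeguards the quality of published content and lowers the risk of publishing mistranslations.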
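The publishing decision above amounts to mapping a predicted score onto an action. The sketch below shows one way to do that. The `Action` enum, the `triage` function, and both threshold values are hypothetical; useful cut-offs depend on the model, the language pair, and the content, and would need to be calibrated against human judgments.

```python
from enum import Enum


class Action(Enum):
    PUBLISH = "publish"
    POST_EDIT = "post-edit"
    RETRANSLATE = "retranslate"


def triage(score: float, publish_at: float = 0.5, post_edit_at: float = -0.5) -> Action:
    """Map a predicted quality score to an action.

    The default thresholds are placeholders chosen for illustration only;
    they are not values recommended by TransQuest.
    """
    if publish_at < post_edit_at:
        raise ValueError("publish_at must not be lower than post_edit_at")
    if score >= publish_at:
        return Action.PUBLISH
    if score >= post_edit_at:
        return Action.POST_EDIT
    return Action.RETRANSLATE
```

Keeping the thresholds as parameters makes it straightforward to tune them per language pair or per content type.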

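🚀 TransQuest: Translation Quality Estimation with Cross-lingual Transformers

TransQuest is a tool for translation quality estimation (QE): it estimates the quality of a translation without needing a reference translation. A QE tool that is accurate and easy to deploy across many language pairs is an essential component of many commercial translation workflows, and it has a wide range of potential uses. For example, when several translation engines are available, it can be used to select the best translation, and it can tell end users how reliable automatically translated content is. A QE system can also decide whether a translation can be published as is in a given context, needs human post-editing, or should be retranslated from scratch by a human. Quality estimation can be performed at different levels: document level, sentence level, and word level.

We have open-sourced TransQuest, the result of our research on translation quality estimation. It won the sentence-level direct assessment quality estimation shared task at WMT 2020, and it outperforms current open-source quality estimation frameworks such as OpenKiwi and DeepQuest.

🚀 Quick Start

TransQuest estimates translation quality without reference translations. It performs well in experiments on multiple language pairs, and pretrained models are available so that users can get started quickly.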

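✨ Key Features

Sentence-level translation quality estimation: predicts post-editing effort and direct assessment scores.
Word-level translation quality estimation: predicts the quality of source words, target words, and target gaps.
Strong performance: outperforms current state-of-the-art quality estimation methods such as DeepQuest and OpenKiwi on all language pairs tested.
Many pretrained models: pretrained quality estimation models for fifteen language pairs are available on HuggingFace.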
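The "target gaps" in the word-level feature are the positions before, between, and after the target words, so a target sentence of n words has n + 1 gaps. The sketch below shows one way to lay out per-word and per-gap labels as a single sequence. The `OK`/`BAD` labels and the interleaved order (gap, word, gap, ..., gap) are assumed here to follow the convention of WMT word-level shared-task data; the function name is hypothetical and this does not show TransQuest's own API.

```python
from typing import List, Sequence


def interleave_gap_and_word_tags(
    word_tags: Sequence[str], gap_tags: Sequence[str]
) -> List[str]:
    """Merge tags into the order gap, word, gap, word, ..., gap.

    For n target words there must be n + 1 gap tags: one before the first
    word, one between each pair of adjacent words, and one after the last.
    """
    if len(gap_tags) != len(word_tags) + 1:
        raise ValueError("expected exactly one more gap tag than word tags")
    merged: List[str] = []
    # zip stops after the last word, so the trailing gap is appended separately.
    for gap, word in zip(gap_tags, word_tags):
        merged.extend([gap, word])
    merged.append(gap_tags[-1])
    return merged
```

A `BAD` tag on a gap would typically mean that one or more words are missing at that position, while a `BAD` tag on a word marks the word itself as incorrect.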

📦 Installation

Install with pip

pip install transquest

Install from source

git clone https://github.com/TharinduDR/TransQuest.git
cd TransQuest
pip install -r requirements.txt

💻 Usage Examples

Basic usage

import torch
from transquest.algo.sentence_level.monotransquest.run_model import MonoTransQuestModel

# Load the pretrained multilingual direct assessment model; use the GPU if one is available.
model = MonoTransQuestModel("xlmroberta", "TransQuest/monotransquest-da-multilingual", num_labels=1, use_cuda=torch.cuda.is_available())
# Each input is a [source sentence, machine translation] pair; no reference translation is needed.
predictions, raw_outputs = model.predict([["Reducerea acestor conflicte este importantă pentru conservare.", "Reducing these conflicts is not important for preservation."]])
print(predictions)
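To score several sentences at once, the pairs can be built from two parallel lists and passed to `predict` in a single call. The helper below is a hypothetical sketch: `score_pairs` is not part of TransQuest, and it assumes only that the model exposes the `predict` method used above, returning a `(predictions, raw_outputs)` tuple. Because the container type of `predictions` may vary (a single number, a list, or an array), the helper normalizes it to a plain list of floats.

```python
from typing import Any, List, Sequence


def score_pairs(model: Any, sources: Sequence[str], translations: Sequence[str]) -> List[float]:
    """Score aligned source/translation lists and return one float per pair."""
    if len(sources) != len(translations):
        raise ValueError("sources and translations must have the same length")
    pairs = [[src, tgt] for src, tgt in zip(sources, translations)]
    if not pairs:
        return []
    # Assumption: predict returns (predictions, raw_outputs), as in the example above.
    predictions, _raw_outputs = model.predict(pairs)
    # Array-like results expose tolist(); convert them to plain Python values.
    if hasattr(predictions, "tolist"):
        predictions = predictions.tolist()
    # A single pair may yield a bare number instead of a sequence.
    if not isinstance(predictions, (list, tuple)):
        predictions = [predictions]
    return [float(p) for p in predictions]
```

Calling `score_pairs(model, sources, translations)` with the model loaded above returns the scores in the same order as the inputs.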

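📚 Documentation

For more details, see the following documentation:

Installation - Install TransQuest locally using pip.
Architectures - The architectures implemented in TransQuest:
1. Sentence-level architectures - We have released two architectures, MonoTransQuest and SiameseTransQuest, for sentence-level quality estimation.
2. Word-level architecture - We have released MicroTransQuest for word-level quality estimation.
Examples - We provide several examples of how to use TransQuest on recent WMT quality estimation shared tasks:
1. Sentence-level examples
2. Word-level examples
Pretrained models - We provide pretrained quality estimation models for fifteen language pairs, covering both the sentence level and the word level:
1. Sentence-level models
2. Word-level models
Contact - Contact us with any questions about TransQuest.

📄 License

This project is licensed under the Apache-2.0 license.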

📚 Citation

If you use the word-level architecture, please consider citing this paper, which was accepted at ACL 2021:

@InProceedings{ranasinghe2021,
author = {Ranasinghe, Tharindu and Orasan, Constantin and Mitkov, Ruslan},
title = {An Exploratory Analysis of Multilingual Word Level Quality Estimation with Cross-Lingual Transformers},
booktitle = {Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics},
year = {2021}
}

If you use the sentence-level architecture, please consider citing these papers, which were presented at COLING 2020 and at WMT 2020 (held at EMNLP 2020):

@InProceedings{transquest:2020a,
author = {Ranasinghe, Tharindu and Orasan, Constantin and Mitkov, Ruslan},
title = {TransQuest: Translation Quality Estimation with Cross-lingual Transformers},
booktitle = {Proceedings of the 28th International Conference on Computational Linguistics},
year = {2020}
}

@InProceedings{transquest:2020b,
author = {Ranasinghe, Tharindu and Orasan, Constantin and Mitkov, Ruslan},
title = {TransQuest at WMT2020: Sentence-Level Direct Assessment},
booktitle = {Proceedings of the Fifth Conference on Machine Translation},
year = {2020}
}