TransQuest: Translation Quality Estimation with Cross-lingual Transformers
The goal of Quality Estimation (QE) is to assess the quality of a translation without referring to a reference translation. High-accuracy QE that can be easily deployed for multiple language pairs is a crucial component in many commercial translation workflows because of its many potential applications. It can help select the best translation when multiple translation engines are available, or inform end users about the reliability of automatically translated content. QE systems can also determine whether a translation can be published as-is in a given context, requires human post-editing before publication, or needs to be translated from scratch by a human. Quality estimation can be performed at the document, sentence, and word levels.
With TransQuest, we have open-sourced our research on translation quality estimation, which also won the sentence-level direct assessment quality estimation shared task at WMT 2020. TransQuest outperforms other open-source quality estimation frameworks such as OpenKiwi and DeepQuest.
Quick Start
Quality Estimation (QE) aims to evaluate translation quality without a reference translation. TransQuest is an open-source research project in this area that has achieved strong results in QE shared tasks.
Features
- Sentence-level Estimation: Perform sentence-level translation quality estimation in two settings: predicting post-editing effort and direct assessment.
- Word-level Estimation: Perform word-level translation quality estimation, predicting the quality of source words, target words, and target gaps (see the sketch after this list).
- High Performance: Outperforms state-of-the-art quality estimation methods such as DeepQuest and OpenKiwi on all the language pairs we experimented with.
- Pre-trained Models: Provides pre-trained quality estimation models for fifteen language pairs on HuggingFace.
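As a quick illustration of the word-level feature, the sketch below mirrors the sentence-level example further down using the MicroTransQuest word-level architecture. The model identifier "TransQuest/microtransquest-en_de-wiki" and the example sentence pair are assumptions chosen for illustration; check the pre-trained model list and the documentation for the exact names and signatures.

import torch
from transquest.algo.word_level.microtransquest.run_model import MicroTransQuestModel

# Load a pre-trained word-level model (model name assumed for illustration)
model = MicroTransQuestModel("xlmroberta", "TransQuest/microtransquest-en_de-wiki", labels=["OK", "BAD"], use_cuda=torch.cuda.is_available())

# Predict OK/BAD tags for the source words and for the target words and gaps
source_tags, target_tags = model.predict([["if not , you may not be able to load the file .", "wenn nicht , können Sie die Datei nicht laden ."]])
print(source_tags)
print(target_tags)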
Installation
From pip
pip install transquest
From Source
git clone https://github.com/TharinduDR/TransQuest.git
cd TransQuest
pip install -r requirements.txt
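To check that the installation succeeded, a minimal import test like the following should run without errors (it only verifies that the package and its main sentence-level class can be imported):

python -c "from transquest.algo.sentence_level.monotransquest.run_model import MonoTransQuestModel"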
Usage Examples
Basic Usage
import torch
from transquest.algo.sentence_level.monotransquest.run_model import MonoTransQuestModel

# Load a pre-trained sentence-level direct assessment (DA) model
model = MonoTransQuestModel("xlmroberta", "TransQuest/monotransquest-da-any_en", num_labels=1, use_cuda=torch.cuda.is_available())

# Predict a quality score for a [source, translation] pair
predictions, raw_outputs = model.predict([["Reducerea acestor conflicte este importantă pentru conservare.", "Reducing these conflicts is not important for preservation."]])
print(predictions)
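The same predict call accepts several source-translation pairs at once and returns one score per pair; for the direct assessment models, higher predicted scores typically indicate better translations. A minimal sketch, reusing the model object loaded above (the first, corrected translation is an illustrative pair added here):

# Score multiple pairs in one call (one DA prediction per pair)
samples = [
    ["Reducerea acestor conflicte este importantă pentru conservare.",
     "Reducing these conflicts is important for preservation."],
    ["Reducerea acestor conflicte este importantă pentru conservare.",
     "Reducing these conflicts is not important for preservation."],
]
predictions, raw_outputs = model.predict(samples)
print(predictions)  # the correct translation is expected to score higher than the erroneous one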
Documentation
For more details, please refer to the following documentation:
- Installation: Install TransQuest locally using pip.
- Architectures: Learn about the model architectures implemented in TransQuest.
- Examples: Worked examples for sentence-level and word-level quality estimation.
- Pre-trained Models: Browse the released pre-trained quality estimation models.
- Contact: Contact us about any issues with TransQuest.
License
This project is licensed under the Apache 2.0 License.
Citations
If you are using the word-level architecture, please consider citing this paper, which was accepted to ACL 2021.
@InProceedings{ranasinghe2021,
author = {Ranasinghe, Tharindu and Orasan, Constantin and Mitkov, Ruslan},
title = {An Exploratory Analysis of Multilingual Word Level Quality Estimation with Cross-Lingual Transformers},
booktitle = {Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics},
year = {2021}
}
If you are using the sentence-level architectures, please consider citing these papers, which were presented at COLING 2020 and at WMT 2020 (co-located with EMNLP 2020).
@InProceedings{transquest:2020a,
author = {Ranasinghe, Tharindu and Orasan, Constantin and Mitkov, Ruslan},
title = {TransQuest: Translation Quality Estimation with Cross-lingual Transformers},
booktitle = {Proceedings of the 28th International Conference on Computational Linguistics},
year = {2020}
}
@InProceedings{transquest:2020b,
author = {Ranasinghe, Tharindu and Orasan, Constantin and Mitkov, Ruslan},
title = {TransQuest at WMT2020: Sentence-Level Direct Assessment},
booktitle = {Proceedings of the Fifth Conference on Machine Translation},
year = {2020}
}