SiameseTransQuest - Direct Assessment (DA) Multilingual Open-Source Translation Quality Estimation Model - Supports Sentence-Level and Word-Level Evaluation, Excellent Performance

SiameseTransQuest DA Multilingual

Developed by TransQuest

TransQuest is an open-source framework for translation quality estimation. It supports both sentence-level and word-level evaluation and won the sentence-level direct assessment shared task at WMT 2020.

Transformers

Open Source License: Apache-2.0 #Multilingual Quality Estimation #Reference-free Evaluation #Translation Quality Prediction

Downloads 376

Release Time: 3/2/2022

Model Overview

TransQuest provides reference-free translation quality estimation capabilities, supports multiple language pairs, and can be used in scenarios such as selecting the best translation and assessing translation reliability.

Model Features

Multilingual Support

Supports translation quality estimation for 15 language pairs.

High Performance

Outperforms existing frameworks like OpenKiwi and DeepQuest in the WMT 2020 Quality Estimation task.

Multi-level Evaluation

Supports quality estimation at two granularities: sentence-level and word-level.

Dual Evaluation Dimensions

Supports both predicting post-editing effort and direct quality assessment.

Model Capabilities

Translation Quality Estimation

Multilingual Processing

Sentence-level Quality Scoring

Word-level Quality Analysis

Use Cases

Machine Translation

Translation Engine Selection

Helps select the best translation when multiple engines are available.

Improves efficiency in selecting high-quality translations.
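
Engine selection can be sketched as scoring every candidate against the same source and keeping the highest-scoring one. This is a minimal illustration, not part of TransQuest: `select_best_translation` and the `PredictFn` type are hypothetical names, and the sketch assumes a scorer that takes a list of [source, translation] pairs and returns one score per pair, with higher meaning better.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical scorer type: takes [[source, translation], ...] and
# returns one quality score per pair (assumed: higher is better).
PredictFn = Callable[[List[List[str]]], Sequence[float]]


def select_best_translation(
    source: str, candidates: Sequence[str], predict: PredictFn
) -> Tuple[str, float]:
    """Return the candidate with the highest predicted score, plus that score."""
    if not candidates:
        raise ValueError("at least one candidate translation is required")
    # Pair the single source sentence with each engine's output.
    pairs = [[source, candidate] for candidate in candidates]
    scores = [float(s) for s in predict(pairs)]
    if len(scores) != len(candidates):
        raise ValueError("scorer must return exactly one score per candidate")
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return candidates[best_index], scores[best_index]


if __name__ == "__main__":
    # Stand-in scorer with fixed values, for demonstration only.
    demo_scores = {"Engine A output": 0.41, "Engine B output": 0.77}

    def demo_predict(pairs: List[List[str]]) -> List[float]:
        return [demo_scores[translation] for _, translation in pairs]

    print(select_best_translation("source text", list(demo_scores), demo_predict))
```

In practice the `predict` argument would be the prediction method of a loaded quality estimation model, provided it follows the one-score-per-pair assumption stated above.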

Translation Reliability Assessment

Provides end-users with reliability indicators for automated translations.

Enhances user trust in machine translation.

Translation Workflow

Post-translation Decision Making

Determines whether translations can be published directly or require human editing.

Optimizes translation workflows.

🚀 TransQuest: Translation Quality Estimation with Cross-lingual Transformers

The goal of quality estimation (QE) is to evaluate the quality of a translation without access to a reference translation. High-accuracy QE, which can be easily deployed for multiple language pairs, is a crucial missing element in many commercial translation workflows due to its numerous potential applications. It can be used to select the best translation when multiple translation engines are available or to inform end users about the reliability of automatically translated content. Additionally, QE systems can determine whether a translation can be directly published in a given context, requires human post-editing before publication, or needs to be translated from scratch by a human. Quality estimation can be performed at different levels: document, sentence, and word levels.
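
The three outcomes described above (publish directly, post-edit, or translate from scratch) can be sketched as a simple threshold rule on a sentence-level score. This is a minimal illustration and not part of TransQuest: the `Action` enum, the `triage` function, and both default thresholds are hypothetical, and the sketch assumes that higher scores mean better translations. Suitable cut-offs depend on the score scale of the model and on the content being published.

```python
from enum import Enum


class Action(Enum):
    PUBLISH = "publish"
    POST_EDIT = "post-edit"
    RETRANSLATE = "retranslate"


def triage(score: float, publish_at: float = 0.8, post_edit_at: float = 0.5) -> Action:
    """Map a quality score to a workflow decision.

    Both thresholds are illustrative assumptions, not recommended values.
    """
    if post_edit_at > publish_at:
        raise ValueError("post_edit_at must not exceed publish_at")
    if score >= publish_at:
        return Action.PUBLISH
    if score >= post_edit_at:
        return Action.POST_EDIT
    return Action.RETRANSLATE


if __name__ == "__main__":
    for example_score in (0.92, 0.63, 0.18):
        print(example_score, triage(example_score).value)
```

Keeping the thresholds as parameters makes it straightforward to tune them per language pair or per content type against a sample of human judgements.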

With TransQuest, we have open-sourced our research on translation quality estimation. TransQuest also won the sentence-level direct assessment quality estimation shared task in [WMT 2020](http://www.statmt.org/wmt20/quality-estimation-task.html). It outperforms current open-source quality estimation frameworks such as OpenKiwi and DeepQuest.

✨ Features

Sentence-level translation quality estimation in two aspects: predicting post-editing efforts and direct assessment.
Word-level translation quality estimation, capable of predicting the quality of source words, target words, and target gaps.
Outperforms current state-of-the-art quality estimation methods like DeepQuest and OpenKiwi in all experimented languages.
Pre-trained quality estimation models for fifteen language pairs are available on HuggingFace.
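
To make the relationship between target words and target gaps concrete, the sketch below separates word tags from gap tags in a single tag sequence. It assumes the interleaved layout used in WMT word-level data, where a target sentence of n words carries 2n + 1 tags that alternate gap, word, gap, and so on. Whether a given TransQuest model returns its predictions in exactly this layout is an assumption to verify, and `split_target_tags` is a hypothetical helper, not a library function.

```python
from typing import List, Sequence, Tuple


def split_target_tags(
    target_words: Sequence[str], tags: Sequence[str]
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Split an interleaved tag sequence into (word, tag) pairs and gap tags.

    Assumed layout: gap_0, word_1, gap_1, word_2, ..., word_n, gap_n.
    """
    expected = 2 * len(target_words) + 1
    if len(tags) != expected:
        raise ValueError(f"expected {expected} tags, got {len(tags)}")
    gap_tags = list(tags[0::2])   # even positions: gaps before, between, after words
    word_tags = list(tags[1::2])  # odd positions: one tag per target word
    return list(zip(target_words, word_tags)), gap_tags


if __name__ == "__main__":
    words = ["das", "ist", "gut"]
    tags = ["OK", "OK", "BAD", "BAD", "OK", "OK", "OK"]
    tagged_words, gaps = split_target_tags(words, tags)
    print(tagged_words)
    print(gaps)
```

A BAD gap tag marks a position where one or more words appear to be missing, while a BAD word tag marks a word that should be replaced or deleted.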

📦 Installation

From pip

pip install transquest

From Source

git clone https://github.com/TharinduDR/TransQuest.git
cd TransQuest
pip install -r requirements.txt

💻 Usage Examples

Basic Usage

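import torch
from transquest.algo.sentence_level.siamesetransquest.run_model import SiameseTransQuestModel

# Load the pre-trained multilingual direct assessment model by its HuggingFace name.
model = SiameseTransQuestModel("TransQuest/siamesetransquest-da-multilingual")
# Each input is a [source, translation] pair. Here the English translation negates the Romanian source.
predictions = model.predict([["Reducerea acestor conflicte este importantă pentru conservare.", "Reducing these conflicts is not important for preservation."]])
print(predictions)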
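
When scoring many sentences, the [source, translation] pair format shown above can be built from two parallel lists. The helper below is a hedged sketch: `score_pairs` is a hypothetical name, and it assumes only that the model object exposes a `predict` method that accepts a list of pairs and returns either one score per pair or a bare number for a single pair. The exact return type of `predict` is not stated here, so the result is coerced to plain Python floats.

```python
from typing import Any, List, Sequence


def score_pairs(model: Any, sources: Sequence[str], translations: Sequence[str]) -> List[float]:
    """Score aligned source/translation lists with any object exposing predict()."""
    if len(sources) != len(translations):
        raise ValueError("sources and translations must have the same length")
    pairs = [[source, translation] for source, translation in zip(sources, translations)]
    if not pairs:
        return []
    raw = model.predict(pairs)
    try:
        scores = [float(value) for value in raw]
    except TypeError:
        # Assumed fallback: a single pair may yield a bare scalar.
        scores = [float(raw)]
    if len(scores) != len(pairs):
        raise ValueError("model returned a different number of scores than pairs")
    return scores


if __name__ == "__main__":
    class FakeModel:
        """Stand-in for a real model; returns the translation length as a score."""

        def predict(self, pairs):
            return [len(translation) for _, translation in pairs]

    print(score_pairs(FakeModel(), ["a", "b"], ["xx", "yyy"]))
```

Validating the lengths up front avoids silently misaligned scores when one of the input lists is truncated.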

📚 Documentation

For more details, follow the documentation.

Installation - Install TransQuest locally using pip.
Architectures - Check out the architectures implemented in TransQuest.
1. Sentence-level Architectures - We have released two architectures, MonoTransQuest and SiameseTransQuest, to perform sentence-level quality estimation.
2. Word-level Architecture - We have released MicroTransQuest to perform word-level quality estimation.
Examples - We have provided several examples of how to use TransQuest in recent WMT quality estimation shared tasks.
1. Sentence-level Examples
2. Word-level Examples
Pre-trained Models - We have provided pre-trained quality estimation models for fifteen language pairs covering both sentence-level and word-level quality estimation.
1. Sentence-level Models
2. Word-level Models
Contact - Contact us for any issues with TransQuest.

📄 License

This project is licensed under the Apache-2.0 license.

📄 Citations

If you are using the word-level architecture, please consider citing this paper, which was accepted at ACL 2021.

@InProceedings{ranasinghe2021,
author = {Ranasinghe, Tharindu and Orasan, Constantin and Mitkov, Ruslan},
title = {An Exploratory Analysis of Multilingual Word Level Quality Estimation with Cross-Lingual Transformers},
booktitle = {Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics},
year = {2021}
}

If you are using the sentence-level architectures, please consider citing these papers, which were presented at COLING 2020 and at WMT 2020 (co-located with EMNLP 2020).

@InProceedings{transquest:2020a,
author = {Ranasinghe, Tharindu and Orasan, Constantin and Mitkov, Ruslan},
title = {TransQuest: Translation Quality Estimation with Cross-Lingual Transformers},
booktitle = {Proceedings of the 28th International Conference on Computational Linguistics},
year = {2020}
}

@InProceedings{transquest:2020b,
author = {Ranasinghe, Tharindu and Orasan, Constantin and Mitkov, Ruslan},
title = {TransQuest at WMT2020: Sentence-Level Direct Assessment},
booktitle = {Proceedings of the Fifth Conference on Machine Translation},
year = {2020}
}
