Opus-mt-tc-big-el-en Open-source Translation Model - Free High-quality Translation from Modern Greek to English

Opus Mt Tc Big El En

Developed by Helsinki-NLP

This is a neural machine translation model from Modern Greek (el) to English (en), part of the OPUS-MT project, designed to provide high-quality translation services.

Machine Translation

Transformers

Supports Multiple Languages #Greek-English Translation #High-Precision Machine Translation #Multilingual Support

Downloads 302

Release Time : 4/13/2022

Model Overview

This model translates Modern Greek text into English using the transformer-big architecture. It was trained on data from the OPUS corpus with the Marian NMT framework.

Model Features

High-Quality Translation

Achieves BLEU scores of 33.9 on flores101-devtest and 68.8 on tatoeba-test-v2021-08-07.

Language Pair

Translates in one direction, from Modern Greek (el) to English (en).

Open-Source License

Released under the cc-by-4.0 license, which permits use and modification provided attribution is given.

Model Capabilities

Text Translation

Multilingual Support

Use Cases

Education

Language Learning Assistance

Helps students translate Greek learning materials into English for better understanding.

Improves learning efficiency and comprehension.

Business

Document Translation

Translates business documents from Greek to English for international communication.

Enhances cross-language communication efficiency.

🚀 opus-mt-tc-big-el-en

A neural machine translation model designed to translate from Modern Greek (1453-) (el) to English (en).

This model is part of the OPUS-MT project, an initiative aimed at making neural machine translation models widely available and accessible for many languages. All models are initially trained with Marian NMT, an efficient NMT implementation written in pure C++, and then converted to PyTorch using the transformers library by Hugging Face. Training data is sourced from OPUS, and training pipelines follow the procedures of OPUS-MT-train.

Publications: "OPUS-MT – Building open translation services for the World" and "The Tatoeba Translation Challenge – Realistic Data Sets for Low Resource and Multilingual MT" (please cite them if you use this model).

@inproceedings{tiedemann-thottingal-2020-opus,
    title = "{OPUS}-{MT} {--} Building open translation services for the World",
    author = {Tiedemann, J{\"o}rg  and Thottingal, Santhosh},
    booktitle = "Proceedings of the 22nd Annual Conference of the European Association for Machine Translation",
    month = nov,
    year = "2020",
    address = "Lisboa, Portugal",
    publisher = "European Association for Machine Translation",
    url = "https://aclanthology.org/2020.eamt-1.61",
    pages = "479--480",
}

@inproceedings{tiedemann-2020-tatoeba,
    title = "The Tatoeba Translation Challenge {--} Realistic Data Sets for Low Resource and Multilingual {MT}",
    author = {Tiedemann, J{\"o}rg},
    booktitle = "Proceedings of the Fifth Conference on Machine Translation",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.wmt-1.139",
    pages = "1174--1182",
}

✨ Features

This is a neural machine translation model for translating from Modern Greek (1453-) (el) to English (en).
It is part of the OPUS-MT project, which aims to make NMT models widely available.
Trained using the Marian NMT framework and converted to PyTorch with the transformers library.

💻 Usage Examples

Basic Usage

from transformers import MarianMTModel, MarianTokenizer

src_text = [
    "Το σχολείο μας έχει εννιά τάξεις.",
    "Άρχισε να τρέχει."
]

# Hub ID, matching the pipeline example in this document
model_name = "Helsinki-NLP/opus-mt-tc-big-el-en"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)
translated = model.generate(**tokenizer(src_text, return_tensors="pt", padding=True))

for t in translated:
    print(tokenizer.decode(t, skip_special_tokens=True))

# expected output:
#     Our school has nine classes.
#     He started running.
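For longer inputs, the basic example above can be wrapped so that sentences are sent to the model in fixed-size batches rather than all at once. The following is a minimal sketch, not part of the original model card: the helper names `batched`, `translate_in_batches`, and `make_marian_translator` are hypothetical, and the default batch size of 8 is an arbitrary assumption. The batching logic is independent of the model, so it can be exercised with any callable that maps a list of strings to a list of strings.

```python
from typing import Callable, Iterator, List, Sequence


def batched(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def translate_in_batches(
    texts: Sequence[str],
    translate_batch: Callable[[List[str]], List[str]],
    batch_size: int = 8,  # assumption: tune to the available memory
) -> List[str]:
    """Translate `texts` batch by batch, preserving the input order."""
    results: List[str] = []
    for batch in batched(texts, batch_size):
        results.extend(translate_batch(batch))
    return results


def make_marian_translator(model_name: str = "Helsinki-NLP/opus-mt-tc-big-el-en"):
    """Build a `translate_batch` callable backed by the Marian model.

    Hypothetical helper: calling it downloads the model weights.
    """
    # Imported lazily so the batching helpers work without transformers installed.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)

    def translate_batch(batch: List[str]) -> List[str]:
        encoded = tokenizer(batch, return_tensors="pt", padding=True, truncation=True)
        generated = model.generate(**encoded)
        return tokenizer.batch_decode(generated, skip_special_tokens=True)

    return translate_batch
```

With the model available, `translate_in_batches(src_text, make_marian_translator())` would return the translations as a list in the same order as the inputs.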

Advanced Usage

from transformers import pipeline
pipe = pipeline("translation", model="Helsinki-NLP/opus-mt-tc-big-el-en")
# The pipeline returns a list of dicts; print only the translated text
print(pipe("Το σχολείο μας έχει εννιά τάξεις.")[0]["translation_text"])

# expected output: Our school has nine classes.
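The translation pipeline in transformers returns its results as a list of dicts keyed by `translation_text`, not as bare strings. A small helper such as the one sketched below can flatten that structure into plain strings; the name `extract_translations` is hypothetical, and the handling of a single dict or nested lists is a defensive assumption, not documented pipeline behavior.

```python
from typing import Any, List


def extract_translations(output: Any) -> List[str]:
    """Collect every `translation_text` value from a pipeline result.

    Accepts a single dict, a list of dicts, or nested lists of dicts.
    """
    if isinstance(output, dict):
        return [output["translation_text"]]
    texts: List[str] = []
    for item in output:
        texts.extend(extract_translations(item))
    return texts


# Stand-in for a real pipeline result, using the sentence from the example above.
sample_output = [{"translation_text": "Our school has nine classes."}]
print(extract_translations(sample_output))
```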

📚 Documentation

Model Info

Property	Details
Release	2022-02-25
Source Language(s)	ell
Target Language(s)	eng
Model Type	transformer-big
Training Data	opusTCv20210807+bt (source)
Tokenization	SentencePiece (spm32k,spm32k)
Original Model	opusTCv20210807+bt_transformer-big_2022-02-25.zip
More Information	OPUS-MT ell-eng README
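The original model file name appears to combine three of the properties listed above: training data, model type, and release date, joined by underscores. The sketch below splits a package name along those lines. The naming pattern is inferred only from this table and is not a documented specification, and `ModelPackage` and `parse_package_name` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ModelPackage:
    training_data: str
    model_type: str
    release: date


def parse_package_name(filename: str) -> ModelPackage:
    """Split '<training data>_<model type>_<YYYY-MM-DD>.zip' into its parts.

    Assumption: the last two underscore-separated fields are the model type
    and the ISO release date; everything before them is the training data.
    """
    stem = filename[:-len(".zip")] if filename.endswith(".zip") else filename
    training_data, model_type, release = stem.rsplit("_", 2)
    return ModelPackage(training_data, model_type, date.fromisoformat(release))


package = parse_package_name("opusTCv20210807+bt_transformer-big_2022-02-25.zip")
print(package.training_data, package.model_type, package.release.isoformat())
```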

Benchmarks

Test set translations: opusTCv20210807+bt_transformer-big_2022-02-25.test.txt
Test set scores: opusTCv20210807+bt_transformer-big_2022-02-25.eval.txt
Benchmark results: benchmark_results.txt
Benchmark output: benchmark_translations.zip

langpair	testset	chr-F	BLEU	#sent	#words
ell-eng	tatoeba-test-v2021-08-07	0.79708	68.8	10899	68682
ell-eng	flores101-devtest	0.61252	33.9	1012	24721
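When comparing the two rows, note that chr-F is reported on a 0 to 1 scale while BLEU is on a 0 to 100 scale, and that the test sets differ in size. The sketch below loads the table as tab-separated text and derives the average number of words per sentence from the #sent and #words columns. The function name `load_benchmarks` and the derived `words_per_sentence` field are assumptions added for illustration; the figures are copied from the table above.

```python
import csv
import io

# Benchmark rows copied from the table above, tab-separated.
BENCHMARK_TSV = (
    "langpair\ttestset\tchr-F\tBLEU\t#sent\t#words\n"
    "ell-eng\ttatoeba-test-v2021-08-07\t0.79708\t68.8\t10899\t68682\n"
    "ell-eng\tflores101-devtest\t0.61252\t33.9\t1012\t24721\n"
)


def load_benchmarks(tsv_text):
    """Parse the benchmark table and add a derived words-per-sentence value."""
    rows = []
    for record in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        sentences = int(record["#sent"])
        words = int(record["#words"])
        rows.append({
            "testset": record["testset"],
            "chrf": float(record["chr-F"]),
            "bleu": float(record["BLEU"]),
            "sentences": sentences,
            "words": words,
            "words_per_sentence": words / sentences,
        })
    return rows


for row in load_benchmarks(BENCHMARK_TSV):
    print(
        f"{row['testset']}: BLEU {row['bleu']}, chrF {row['chrf']}, "
        f"{row['words_per_sentence']:.2f} words/sentence"
    )
```

By this measure the Tatoeba test set averages about 6.3 words per sentence and the FLORES-101 devtest about 24.4, so the two scores are measured on quite different kinds of text.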

Acknowledgements

The work is supported by the European Language Grid as pilot project 2866, by the FoTran project, funded by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 771113), and the MeMAD project, funded by the European Union’s Horizon 2020 Research and Innovation Programme under grant agreement No 780069. We are also grateful for the generous computational resources and IT infrastructure provided by CSC -- IT Center for Science, Finland.

Model Conversion Info

Property	Details
Transformers Version	4.16.2
OPUS-MT Git Hash	3405783
Port Time	Wed Apr 13 18:48:34 EEST 2022
Port Machine	LM0-400-22516.local

📄 License

The model is licensed under cc-by-4.0.
