🚀 ara-epo
This model translates Arabic into Esperanto. It is a transformer-align model trained on text pre-processed with normalization and SentencePiece, and it is published together with its original weights, test set translations, and test set scores.
✨ Features
- Language Pair: Translates from Arabic (source group) to Esperanto (target group).
- Model: Uses the transformer-align model.
- Pre-processing: Applies normalization and SentencePiece (spm4k, spm4k).
- Resources: Offers downloadable original weights, test set translations, and test set scores.
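For readers who want to try the model, here is a minimal sketch using the Marian classes from the Hugging Face `transformers` library. It assumes the checkpoint is published on the Hub as `Helsinki-NLP/opus-mt-ar-eo`; that ID is inferred from the `ar-eo` short pair listed under System Info and is not stated in this README, so verify it before relying on it. The helper names `hub_model_id` and `translate` are hypothetical.

```python
from typing import List


def hub_model_id(src: str = "ar", tgt: str = "eo") -> str:
    """Build the assumed Hub ID from two-letter language codes."""
    # Assumption: the port follows the "Helsinki-NLP/opus-mt-{src}-{tgt}" pattern.
    return f"Helsinki-NLP/opus-mt-{src}-{tgt}"


def translate(sentences: List[str], model_id: str = hub_model_id()) -> List[str]:
    """Translate Arabic sentences into Esperanto (downloads weights on first call)."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(model_id)
    model = MarianMTModel.from_pretrained(model_id)
    # The tokenizer applies the SentencePiece segmentation shipped with the checkpoint.
    batch = tokenizer(sentences, return_tensors="pt", padding=True)
    generated = model.generate(**batch)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

Calling `translate` with a list of Arabic strings should return a list of Esperanto strings of the same length. Given the benchmark score reported for this model, treat the output as a rough draft rather than a finished translation.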
📚 Documentation
ara-epo Details
- Source Group: Arabic
- Target Group: Esperanto
- OPUS Readme: [ara-epo](https://github.com/Helsinki-NLP/Tatoeba-Challenge/tree/master/models/ara-epo/README.md)
- Model: transformer-align
- Source Languages: apc, apc_Latn, ara, arq, arq_Latn, arz
- Target Languages: epo
- Pre-processing: normalization + SentencePiece (spm4k, spm4k)
- Download Original Weights: [opus-2020-06-16.zip](https://object.pouta.csc.fi/Tatoeba-MT-models/ara-epo/opus-2020-06-16.zip)
- Test Set Translations: [opus-2020-06-16.test.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/ara-epo/opus-2020-06-16.test.txt)
- Test Set Scores: [opus-2020-06-16.eval.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/ara-epo/opus-2020-06-16.eval.txt)
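The three download links above share one prefix and differ only in their suffix. The sketch below rebuilds them from the language pair and release name, which can be handy when scripting downloads. The function name `release_urls` is hypothetical, and the idea that other pairs or releases follow the same layout is an assumption, not something this README confirms.

```python
BASE_URL = "https://object.pouta.csc.fi/Tatoeba-MT-models"


def release_urls(pair: str = "ara-epo", release: str = "opus-2020-06-16") -> dict:
    """Return the weights, test-translation, and score URLs for one release."""
    # All three files sit under <base>/<pair>/ and share the release name as a stem.
    prefix = f"{BASE_URL}/{pair}/{release}"
    return {
        "weights": f"{prefix}.zip",
        "translations": f"{prefix}.test.txt",
        "scores": f"{prefix}.eval.txt",
    }
```

With the defaults, the returned dictionary reproduces the three links listed for this model.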
Benchmarks
| testset | BLEU | chr-F |
|---|---|---|
| Tatoeba-test.ara.epo | 18.9 | 0.376 |
System Info
| Property | Details |
|---|---|
| hf_name | ara-epo |
| source_languages | ara |
| target_languages | epo |
| opus_readme_url | [https://github.com/Helsinki-NLP/Tatoeba-Challenge/tree/master/models/ara-epo/README.md](https://github.com/Helsinki-NLP/Tatoeba-Challenge/tree/master/models/ara-epo/README.md) |
| original_repo | Tatoeba-Challenge |
| tags | ['translation'] |
| languages | ['ar', 'eo'] |
| src_constituents | {'apc', 'ara', 'arq_Latn', 'arq', 'afb', 'ara_Latn', 'apc_Latn', 'arz'} |
| tgt_constituents | {'epo'} |
| src_multilingual | False |
| tgt_multilingual | False |
| prepro | normalization + SentencePiece (spm4k, spm4k) |
| url_model | [https://object.pouta.csc.fi/Tatoeba-MT-models/ara-epo/opus-2020-06-16.zip](https://object.pouta.csc.fi/Tatoeba-MT-models/ara-epo/opus-2020-06-16.zip) |
| url_test_set | [https://object.pouta.csc.fi/Tatoeba-MT-models/ara-epo/opus-2020-06-16.test.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/ara-epo/opus-2020-06-16.test.txt) |
| src_alpha3 | ara |
| tgt_alpha3 | epo |
| short_pair | ar-eo |
| chrF2_score | 0.376 |
| bleu | 18.9 |
| brevity_penalty | 0.948 |
| ref_len | 4506.0 |
| src_name | Arabic |
| tgt_name | Esperanto |
| train_date | 2020-06-16 |
| src_alpha2 | ar |
| tgt_alpha2 | eo |
| prefer_old | False |
| long_pair | ara-epo |
| helsinki_git_sha | 480fcbe0ee1bf4774bcbe6226ad9f58e63f6c535 |
| transformers_git_sha | 2207e5d8cb224e954a7cba69fa4ac2309e9ff30b |
| port_machine | brutasse |
| port_time | 2020-08-21-14:41 |
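The `brevity_penalty` and `ref_len` entries can be read together. Assuming the score was computed with the standard BLEU brevity penalty (1 when the system output is longer than the reference, otherwise exp(1 - r/c) for reference length r and output length c), the sketch below inverts that formula to estimate how long the system output was. The function names are hypothetical, and the estimate is only as precise as the rounded 0.948 value.

```python
import math


def brevity_penalty(ref_len: float, hyp_len: float) -> float:
    """Standard BLEU brevity penalty for reference length r and output length c."""
    if hyp_len <= 0:
        return 0.0
    if hyp_len > ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / hyp_len)


def implied_hyp_len(ref_len: float, bp: float) -> float:
    """Invert bp = exp(1 - r/c) to recover c; only meaningful when 0 < bp < 1."""
    if not 0.0 < bp < 1.0:
        raise ValueError("only defined for 0 < bp < 1")
    return ref_len / (1.0 - math.log(bp))


# Values taken from the table above.
estimate = implied_hyp_len(4506.0, 0.948)
print(f"implied output length: {estimate:.0f} tokens")
```

Under that assumption, the system output comes out roughly 5% shorter than the reference, so a small part of the gap to a higher BLEU score is due to short translations rather than wrong word choices.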
📄 License
This project is licensed under the Apache-2.0 license.