eng-hin Translation Model
This project focuses on English-to-Hindi translation. It provides a trained model and its evaluation data, with the aim of high-quality translation from English into Hindi.
Quick Start
The eng-hin model translates English into Hindi. To explore its performance, download the original weights and the test set translations from the links in the Model Information table.
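The download step can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names `artifact_urls` and `download_all` and the default output directory are hypothetical, and the URLs are assembled from the release name `opus-2020-06-17` and the language pair `eng-hin` as they appear in the links of this document.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Base location of the released files, taken from the links in this document.
BASE_URL = "https://object.pouta.csc.fi/Tatoeba-MT-models"


def artifact_urls(pair="eng-hin", release="opus-2020-06-17"):
    """Build the three download URLs for one release (hypothetical helper)."""
    prefix = f"{BASE_URL}/{pair}/{release}"
    return {
        "weights": f"{prefix}.zip",
        "test_translations": f"{prefix}.test.txt",
        "test_scores": f"{prefix}.eval.txt",
    }


def download_all(dest_dir="eng-hin", urls=None):
    """Download every artifact into dest_dir and return the local paths."""
    urls = urls or artifact_urls()
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    paths = {}
    for name, url in urls.items():
        # Keep the remote file name, e.g. opus-2020-06-17.zip
        target = dest / url.rsplit("/", 1)[-1]
        urlretrieve(url, target)
        paths[name] = target
    return paths
```

Calling `download_all()` would fetch the archive and the two text files into a local `eng-hin` directory; the weights archive still has to be unpacked before use.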
Features
- Model Type: The model used is transformer-align.
- Pre-processing: It employs normalization and SentencePiece (spm32k, spm32k) for pre-processing.
Installation
No specific installation steps are provided.
Documentation
Model Information
| Property | Details |
|----------|---------|
| Model Type | transformer-align |
| Source Language | English (eng) |
| Target Language | Hindi (hin) |
| Pre-processing | normalization + SentencePiece (spm32k, spm32k) |
| Download Original Weights | [opus-2020-06-17.zip](https://object.pouta.csc.fi/Tatoeba-MT-models/eng-hin/opus-2020-06-17.zip) |
| Test Set Translations | [opus-2020-06-17.test.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/eng-hin/opus-2020-06-17.test.txt) |
| Test Set Scores | [opus-2020-06-17.eval.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/eng-hin/opus-2020-06-17.eval.txt) |
System Info
- hf_name: eng-hin
- source_languages: eng
- target_languages: hin
- opus_readme_url: [eng-hin README](https://github.com/Helsinki-NLP/Tatoeba-Challenge/tree/master/models/eng-hin/README.md)
- original_repo: Tatoeba-Challenge
- tags: ['translation']
- languages: ['en', 'hi']
- src_constituents: {'eng'}
- tgt_constituents: {'hin'}
- src_multilingual: False
- tgt_multilingual: False
- prepro: normalization + SentencePiece (spm32k, spm32k)
- url_model: [Model Download](https://object.pouta.csc.fi/Tatoeba-MT-models/eng-hin/opus-2020-06-17.zip)
- url_test_set: [Test Set Download](https://object.pouta.csc.fi/Tatoeba-MT-models/eng-hin/opus-2020-06-17.test.txt)
- src_alpha3: eng
- tgt_alpha3: hin
- short_pair: en-hi
- chrF2_score: 0.447
- bleu: 16.1
- brevity_penalty: 1.0
- ref_len: 32904.0
- src_name: English
- tgt_name: Hindi
- train_date: 2020-06-17
- src_alpha2: en
- tgt_alpha2: hi
- prefer_old: False
- long_pair: eng-hin
- helsinki_git_sha: 480fcbe0ee1bf4774bcbe6226ad9f58e63f6c535
- transformers_git_sha: 2207e5d8cb224e954a7cba69fa4ac2309e9ff30b
- port_machine: brutasse
- port_time: 2020-08-21-14:41
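The `brevity_penalty` and `ref_len` fields above relate to BLEU. Under the standard BLEU definition, the penalty is 1 when the system output is at least as long as the reference and exp(1 - r/c) otherwise, where r is the reference length and c the output length. The sketch below illustrates that formula only; the function name is hypothetical, and the shorter output length of 30000 is an invented value for illustration, not a result from this model.

```python
import math


def brevity_penalty(hyp_len, ref_len):
    """Standard BLEU brevity penalty for a corpus-level length pair."""
    if hyp_len <= 0:
        return 0.0  # an empty output gets no credit
    if hyp_len >= ref_len:
        return 1.0  # output is long enough: no penalty
    return math.exp(1.0 - ref_len / hyp_len)


REF_LEN = 32904  # ref_len from the metadata above

# Output as long as the reference: penalty of 1.0, as reported above.
print(brevity_penalty(REF_LEN, REF_LEN))

# Hypothetical shorter output (assumed value): penalty drops below 1.
print(brevity_penalty(30000, REF_LEN))
```

A reported penalty of 1.0 therefore indicates that the translations were, in total, not shorter than the references.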
Benchmarks
| Testset | BLEU | chr-F |
|---------|------|-------|
| newsdev2014.eng.hin | 6.9 | 0.296 |
| newstest2014-hien.eng.hin | 9.9 | 0.323 |
| Tatoeba-test.eng.hin | 16.1 | 0.447 |
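The chr-F column corresponds to the `chrF2_score` field in System Info, where the 2 denotes an F-score with β = 2, weighting recall more heavily than precision. The sketch below shows only that final F-β combination; the full metric also extracts and averages character n-gram statistics, which is omitted here. The function name and the precision and recall inputs are hypothetical.

```python
def f_beta(precision, recall, beta=2.0):
    """Combine precision and recall into an F-beta score (beta=2 as in chrF2)."""
    if precision <= 0.0 or recall <= 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical inputs: with beta=2, higher recall is rewarded more
# than equally higher precision.
print(f_beta(0.4, 0.5))
print(f_beta(0.5, 0.4))
```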
License
This project is licensed under the Apache-2.0 license.