ko2en
This model is a fine-tuned version of [KETI-AIR/long-ke-t5-base](https://huggingface.co/KETI-AIR/long-ke-t5-base), designed for high-quality Korean-to-English translation.
Quick Start
This model is a fine-tuned version of [KETI-AIR/long-ke-t5-base](https://huggingface.co/KETI-AIR/long-ke-t5-base) on the following datasets: KETI-AIR/aihub_koenzh_food_translation, KETI-AIR/aihub_scitech_translation, KETI-AIR/aihub_scitech20_translation, KETI-AIR/aihub_socialtech20_translation, and KETI-AIR/aihub_spoken_language_translation (dataset configuration arguments: koen, none, none, none, none, respectively).
It achieves the following results on the evaluation set:
- Loss: 0.5186
- Bleu: 58.7008
- Gen Len: 27.0073
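A minimal inference sketch follows, assuming the model is loaded through the Hugging Face `transformers` Auto classes and that inputs carry the `translate_ko2en: ` prefix seen in the widget examples. The card does not state the Hub repository id, so `model_id` is left as a parameter the caller must supply; `build_prompt` and `translate` are hypothetical helper names introduced for illustration, and `max_new_tokens=128` is an arbitrary choice.

```python
# Sketch only: the helper names and the generation settings are assumptions,
# not part of the original model card.
PREFIX = "translate_ko2en: "  # task prefix taken from the widget examples


def build_prompt(korean_text: str) -> str:
    """Prepend the task prefix to a Korean sentence."""
    return PREFIX + korean_text.strip()


def translate(korean_texts, model_id, max_new_tokens=128):
    """Translate a list of Korean sentences with a seq2seq checkpoint."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    prompts = [build_prompt(text) for text in korean_texts]
    inputs = tokenizer(prompts, return_tensors="pt", padding=True)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)


# Example call (needs network access and the real Hub repo id of this model):
# print(translate(["IBM 왓슨X는 AI 및 데이터 플랫폼이다."], model_id="<hub-repo-id>"))
```

Keeping the prefix in one helper makes it harder to forget, since a T5-style model fine-tuned with a task prefix may produce poor output when the prefix is missing.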
Features
- High-quality Translation: Achieves a BLEU score of 58.7008 on the evaluation set.
- Fine-tuned Model: Based on the [KETI-AIR/long-ke-t5-base](https://huggingface.co/KETI-AIR/long-ke-t5-base) model, fine-tuned on five KETI-AIR AI Hub translation datasets.
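The BLEU figure above is on a 0 to 100 scale. The card does not say which BLEU implementation or tokenization produced it, so the sketch below is only a simplified corpus-level BLEU (whitespace tokens, one reference per sentence, no smoothing) to illustrate how the metric combines n-gram precision with a brevity penalty. It is not expected to reproduce the reported numbers; `corpus_bleu` and `ngrams` are hypothetical names.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Simplified corpus BLEU on a 0-100 scale (single reference, no smoothing)."""
    matches = [0] * max_n  # clipped n-gram matches per order
    totals = [0] * max_n   # hypothesis n-gram counts per order
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, r_counts[g]) for g, c in h_counts.items())
            totals[n - 1] += sum(h_counts.values())

    # Without smoothing, any order with zero matches gives a score of zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0

    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)
```

For published comparisons, a standard tool such as sacreBLEU is the safer choice, because scores depend heavily on tokenization.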
Documentation
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Technical Details
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- total_train_batch_size: 128
- total_eval_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0
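The batch-size and scheduler entries above can be checked with a little arithmetic. The sketch below assumes no gradient accumulation and no warmup (neither is listed in the card) and takes 281286 as the total step count from the training results; `effective_batch_size` and `linear_lr` are hypothetical helpers, not code from the training run.

```python
PER_DEVICE_TRAIN_BATCH = 16
NUM_DEVICES = 8
BASE_LR = 1e-3
TOTAL_STEPS = 281_286  # final step reported in the training results
WARMUP_STEPS = 0       # assumption: the card lists no warmup


def effective_batch_size(per_device, devices, grad_accum=1):
    """Examples consumed per optimizer step across all devices."""
    return per_device * devices * grad_accum


def linear_lr(step, base_lr=BASE_LR, total=TOTAL_STEPS, warmup=WARMUP_STEPS):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


print(effective_batch_size(PER_DEVICE_TRAIN_BATCH, NUM_DEVICES))  # 16 * 8 = 128
print(linear_lr(93_762))  # learning rate after the first of three epochs
```

Under these assumptions the learning rate falls to two thirds of its initial value after the first epoch and reaches zero at the final step.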
Training results
| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:----:|:-------:|
| 0.6234 | 1.0 | 93762 | 0.5843 | 33.9843 | 17.5378 |
| 0.5334 | 2.0 | 187524 | 0.5369 | 35.3271 | 17.5388 |
| 0.4704 | 3.0 | 281286 | 0.5186 | 36.0533 | 17.5335 |
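The step counts in these results allow a rough estimate of the training set size. The sketch below assumes each optimizer step consumes one global batch of 128 examples (no gradient accumulation) and that the last, possibly partial, batch of an epoch is kept; neither detail is stated in the card, so the result is a bound rather than an exact count. `example_count_bounds` is a hypothetical helper.

```python
STEPS_PER_EPOCH = 93_762   # step count at epoch 1.0
TOTAL_TRAIN_BATCH = 128    # total_train_batch_size from the hyperparameters


def example_count_bounds(steps, batch):
    """Return (lower, upper) bounds on examples seen in one epoch."""
    upper = steps * batch            # every batch is full
    lower = (steps - 1) * batch + 1  # last batch holds a single example
    return lower, upper


lower, upper = example_count_bounds(STEPS_PER_EPOCH, TOTAL_TRAIN_BATCH)
print(f"between {lower:,} and {upper:,} training examples per epoch")
```

Under these assumptions the combined training data holds roughly 12 million sentence pairs.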
Framework versions
- Transformers 4.25.1
- Pytorch 1.12.0
- Datasets 2.8.0
- Tokenizers 0.13.2
License
This model is licensed under the Apache 2.0 license.
Additional Information
Tags
Datasets
- KETI-AIR/aihub_koenzh_food_translation
- KETI-AIR/aihub_scitech_translation
- KETI-AIR/aihub_scitech20_translation
- KETI-AIR/aihub_socialtech20_translation
- KETI-AIR/aihub_spoken_language_translation
Metrics
Pipeline tag
Widget examples
- Sample 1:
- text: 'translate_ko2en: IBM 왓슨X는 AI 및 데이터 플랫폼이다. 신뢰할 수 있는 데이터, 속도, 거버넌스를 갖고 파운데이션 모델 및 머신 러닝 기능을 포함한 AI 모델을 학습시키고, 조정해, 조직 전체에서 활용하기 위한 전 과정을 아우르는 기술과 서비스를 제공한다.'
- Sample 2:
- text: 'translate_ko2en: ์ด์ฉ์๋ ์ ๋ขฐํ ์ ์๊ณ ๊ฐ๋ฐฉ๋ ํ๊ฒฝ์์ ์์ ์ ๋ฐ์ดํฐ์ ๋ํด ์์ฒด์ ์ธ AI๋ฅผ ๊ตฌ์ถํ๊ฑฐ๋, ์์ฅ์ ์ถ์๋ AI ๋ชจ๋ธ์ ์ ๊ตํ๊ฒ ์กฐ์ ํ ์ ์๋ค. ๋๊ท๋ชจ๋ก ํ์ฉํ๊ธฐ ์ํ ๋๊ตฌ ์ธํธ, ๊ธฐ์ , ์ธํ๋ผ ๋ฐ ์ ๋ฌธ ์ปจ์คํ
์๋น์ค๋ฅผ ํ์ฉํ ์ ์๋ค.'
Model index
- name: ko2en
results:
- task:
type: translation
name: Translation
dataset:
name: KETI-AIR/aihub_koenzh_food_translation, KETI-AIR/aihub_scitech_translation, KETI-AIR/aihub_scitech20_translation, KETI-AIR/aihub_socialtech20_translation, KETI-AIR/aihub_spoken_language_translation koen,none,none,none,none
type: KETI-AIR/aihub_koenzh_food_translation, KETI-AIR/aihub_scitech_translation, KETI-AIR/aihub_scitech20_translation, KETI-AIR/aihub_socialtech20_translation, KETI-AIR/aihub_spoken_language_translation
args: koen,none,none,none,none
metrics:
- type: bleu
value: 58.7008
name: Bleu