ko2en_bidirection2
This is a translation model fine-tuned from KETI-AIR/long-ke-t5-base. It translates in both directions between Korean and English.
Quick Start
This model is a fine-tuned version of [KETI-AIR/long-ke-t5-base](https://huggingface.co/KETI-AIR/long-ke-t5-base) on the KETI-AIR/aihub_koenzh_food_translation, KETI-AIR/aihub_scitech_translation, KETI-AIR/aihub_scitech20_translation, KETI-AIR/aihub_socialtech20_translation, and KETI-AIR/aihub_spoken_language_translation datasets, with dataset configurations koen, none, none, none, and none, respectively.
It achieves the following results on the evaluation set:
- Loss: 0.5716
- Bleu: 51.5949
- Gen Len: 28.8348
Features
- Multi-dataset training: Trained on five translation datasets covering domains that include food, science and technology, and spoken language.
- Reported metrics: A BLEU score of 51.5949 on the evaluation set, with an average generation length of 28.8348.
Usage Examples
Basic Usage
You can use the following text examples to test the model's translation capabilities:
- KO2EN 1: 'translate_ko2en: IBM ์์จX๋ AI ๋ฐ ๋ฐ์ดํฐ ํ๋ซํผ์ด๋ค. ์ ๋ขฐํ ์ ์๋ ๋ฐ์ดํฐ, ์๋, ๊ฑฐ๋ฒ๋์ค๋ฅผ ๊ฐ๊ณ ํ์ด๋ฐ์ด์
๋ชจ๋ธ ๋ฐ ๋จธ์ ๋ฌ๋ ๊ธฐ๋ฅ์ ํฌํจํ AI ๋ชจ๋ธ์ ํ์ต์ํค๊ณ , ์กฐ์ ํด, ์กฐ์ง ์ ์ฒด์์ ํ์ฉํ๊ธฐ ์ํ ์ ๊ณผ์ ์ ์์ฐ๋ฅด๋ ๊ธฐ์ ๊ณผ ์๋น์ค๋ฅผ ์ ๊ณตํ๋ค.'
- KO2EN 2: 'translate_ko2en: ์ด์ฉ์๋ ์ ๋ขฐํ ์ ์๊ณ ๊ฐ๋ฐฉ๋ ํ๊ฒฝ์์ ์์ ์ ๋ฐ์ดํฐ์ ๋ํด ์์ฒด์ ์ธ AI๋ฅผ ๊ตฌ์ถํ๊ฑฐ๋, ์์ฅ์ ์ถ์๋ AI ๋ชจ๋ธ์ ์ ๊ตํ๊ฒ ์กฐ์ ํ ์ ์๋ค. ๋๊ท๋ชจ๋ก ํ์ฉํ๊ธฐ ์ํ ๋๊ตฌ ์ธํธ, ๊ธฐ์ , ์ธํ๋ผ ๋ฐ ์ ๋ฌธ ์ปจ์คํ
์๋น์ค๋ฅผ ํ์ฉํ ์ ์๋ค.'
- EN2KO 1: 'translate_en2ko: The Seoul Metropolitan Government said Wednesday that it would develop an AI-based congestion monitoring system to provide better information to passengers about crowd density at each subway station.'
- EN2KO 2: 'translate_en2ko: According to Seoul Metro, the operator of the subway service in Seoul, the new service will help analyze the real-time flow of passengers and crowd levels in subway compartments, improving operational efficiency.'
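The prompts above can be passed to the model with the Hugging Face Transformers library. The following is a minimal sketch, not the authors' own script. MODEL_ID is a placeholder that must be replaced with this model's actual Hub repository ID, which the source does not state. The helper names build_prompt and translate and the max_new_tokens setting are assumptions made for illustration. Only the translate_ko2en: and translate_en2ko: prefixes come from the examples above.

```python
# Placeholder (assumption): replace with this model's actual Hub repository ID.
MODEL_ID = "your-namespace/ko2en_bidirection2"

# Task prefixes taken from the example prompts in this section.
PREFIXES = {"ko2en": "translate_ko2en: ", "en2ko": "translate_en2ko: "}


def build_prompt(text: str, direction: str) -> str:
    """Prepend the task prefix for the requested direction ("ko2en" or "en2ko")."""
    if direction not in PREFIXES:
        raise ValueError(f"direction must be one of {sorted(PREFIXES)}")
    return PREFIXES[direction] + text.strip()


def translate(text: str, direction: str, max_new_tokens: int = 128) -> str:
    """Translate one text. Loads the tokenizer and model on every call,
    so cache them outside this function for repeated use."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_prompt(text, direction), return_tensors="pt")
    # max_new_tokens is an illustrative choice, not a value from the model card.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    sentence = "The new service will help analyze the real-time flow of passengers."
    print(translate(sentence, "en2ko"))
```

Keeping the prefix logic in its own function makes it easy to check that every input carries exactly one direction prefix before it reaches the tokenizer.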
Documentation
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Technical Details
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- total_train_batch_size: 128
- total_eval_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0
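The totals in this list follow from the per-device values: 16 examples per device across 8 devices gives 128. The sketch below reproduces that arithmetic and illustrates what a linear schedule does to the learning rate. It assumes no gradient accumulation and no warmup, neither of which is stated above, and it takes the final step count of 562572 from the training results. The function names are illustrative.

```python
LEARNING_RATE = 0.001
TRAIN_BATCH_SIZE = 16   # per device
NUM_DEVICES = 8
TOTAL_STEPS = 562572    # final step reported in the training results


def total_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Effective batch size per optimizer step (grad_accum_steps=1 is an assumption)."""
    return per_device * num_devices * grad_accum_steps


def linear_lr(step: int, total_steps: int = TOTAL_STEPS, base_lr: float = LEARNING_RATE) -> float:
    """Linear decay from base_lr at step 0 to 0 at total_steps (no warmup assumed)."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


if __name__ == "__main__":
    print("total_train_batch_size:", total_batch_size(TRAIN_BATCH_SIZE, NUM_DEVICES))
    for step in (0, 187524, 375048, 562572):
        print(f"step {step}: lr = {linear_lr(step):.6f}")
```

If warmup was used in the actual run, the early part of the curve would differ from this sketch.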
Training results
| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|---|---|---|---|---|---|
| 0.7004 | 1.0 | 187524 | 0.6461 | 28.0622 | 17.8368 |
| 0.6176 | 2.0 | 375048 | 0.5967 | 29.3033 | 17.8281 |
| 0.5642 | 3.0 | 562572 | 0.5716 | 30.0045 | 17.8366 |
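The step counts in these results allow a rough estimate of the training set size. The sketch below checks that every epoch spans the same number of steps and multiplies that by the total train batch size of 128 from the hyperparameters. The result is only an approximation: it assumes one optimizer step per batch (no gradient accumulation) and full final batches, neither of which the source confirms. The function names are illustrative.

```python
# Cumulative step counts at the end of each epoch, from the training results.
STEPS_AT_EPOCH_END = {1: 187524, 2: 375048, 3: 562572}
TOTAL_TRAIN_BATCH_SIZE = 128  # from the training hyperparameters


def steps_per_epoch(steps_at_epoch_end: dict) -> list:
    """Return the number of steps taken within each epoch, in epoch order."""
    ordered = [steps_at_epoch_end[epoch] for epoch in sorted(steps_at_epoch_end)]
    return [end - start for start, end in zip([0] + ordered, ordered)]


def approx_examples_per_epoch(steps: int, batch_size: int) -> int:
    """Upper-bound estimate of examples seen per epoch (assumes full batches)."""
    return steps * batch_size


if __name__ == "__main__":
    per_epoch = steps_per_epoch(STEPS_AT_EPOCH_END)
    print("steps per epoch:", per_epoch)
    print("approx. examples per epoch:",
          approx_examples_per_epoch(per_epoch[0], TOTAL_TRAIN_BATCH_SIZE))
```

Under these assumptions, 187524 steps of 128 examples correspond to about 24 million training examples per epoch (187524 × 128 = 24,003,072).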
Framework versions
- Transformers 4.25.1
- Pytorch 1.12.0
- Datasets 2.8.0
- Tokenizers 0.13.2
License
This model is licensed under the Apache-2.0 license.