🚀 traintogpb-ke-t5-base-aihub-koen-translation-integrated-10m-en-to-ko
A machine translation model that translates English to Korean, fine-tuned from a pre-trained model on a large-scale dataset.
🚀 Quick Start
This model, traintogpb-ke-t5-base-aihub-koen-translation-integrated-10m-en-to-ko, is designed for English-to-Korean translation. Here is a simple example of how to use it with Hugging Face Transformers:
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer
model = AutoModelForSeq2SeqLM.from_pretrained("traintogpb-ke-t5-base-aihub-koen-translation-integrated-10m-en-to-ko")
tokenizer = AutoTokenizer.from_pretrained("traintogpb-ke-t5-base-aihub-koen-translation-integrated-10m-en-to-ko")

# Tokenize the English input, generate the Korean translation, and decode it
inputs = tokenizer.encode("This is a sample text.", return_tensors="pt")
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
✨ Features
- Translation Capability: Specialized in translating English text to Korean.
- Fine-Tuned Model: Fine-tuned from the [KETI-AIR/ke-t5-base](https://huggingface.co/KETI-AIR/ke-t5-base) model for better performance on English-to-Korean translation.
📚 Documentation
Model Architecture
The model uses the ke-t5-base architecture, which is based on the T5 (Text-to-Text Transfer Transformer) model.
Training Data
The model was trained on the [aihub-koen-translation-integrated-base-10m](https://huggingface.co/datasets/traintogpb/aihub-koen-translation-integrated-base-10m) dataset, which is designed for English-to-Korean translation tasks.
Training Procedure
Training Parameters
The model was trained with the following parameters:
- Learning Rate: 0.0005
- Weight Decay: 0.01
- Batch Size: 64 (training), 128 (evaluation)
- Number of Epochs: 2
- Save Steps: 500
- Maximum Saved Checkpoints: 2
- Evaluation Strategy: At the end of each epoch
- Logging Strategy: No logging
- Use of FP16: No
- Gradient Accumulation Steps: 2
- Reporting: None
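The parameters above can be expressed as a Hugging Face `Seq2SeqTrainingArguments` configuration. This is a minimal sketch, assuming the standard `Seq2SeqTrainer` setup; the `output_dir` name is a placeholder, not taken from the original training run:

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the training configuration described above ("outputs" is a placeholder path)
training_args = Seq2SeqTrainingArguments(
    output_dir="outputs",
    learning_rate=5e-4,
    weight_decay=0.01,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=128,
    num_train_epochs=2,
    save_steps=500,
    save_total_limit=2,           # keep at most 2 checkpoints
    evaluation_strategy="epoch",  # evaluate at the end of each epoch
    logging_strategy="no",
    fp16=False,
    gradient_accumulation_steps=2,
    report_to="none",
)
```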
Hardware
The training was performed on a single GPU system with an NVIDIA A100 (40GB).
Performance
The model achieved the following BLEU scores during training:
- Epoch 1: 18.006119
- Epoch 2: 18.838066
📄 License
This model is released under the Apache-2.0 license.
| Property | Details |
|----------|---------|
| Model Type | Machine translation model for English-to-Korean translation |
| Training Data | [aihub-koen-translation-integrated-base-10m](https://huggingface.co/datasets/traintogpb/aihub-koen-translation-integrated-base-10m) |