🚀 traintogpb-ke-t5-base-aihub-koen-translation-integrated-10m-en-to-ko
A machine translation model that translates English to Korean, fine-tuned from a pre-trained model on a large-scale dataset.
🚀 Quick Start
This model, traintogpb-ke-t5-base-aihub-koen-translation-integrated-10m-en-to-ko, is designed for English-to-Korean translation. Here is a simple example of how to use it with Hugging Face Transformers:
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer
model = AutoModelForSeq2SeqLM.from_pretrained("traintogpb-ke-t5-base-aihub-koen-translation-integrated-10m-en-to-ko")
tokenizer = AutoTokenizer.from_pretrained("traintogpb-ke-t5-base-aihub-koen-translation-integrated-10m-en-to-ko")

# Tokenize the English input, generate the Korean translation, and decode it
inputs = tokenizer.encode("This is a sample text.", return_tensors="pt")
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
✨ Features
- Translation Capability: Specialized in translating English text to Korean.
- Fine-Tuned Model: Fine-tuned from the [KETI-AIR/ke-t5-base](https://huggingface.co/KETI-AIR/ke-t5-base) model for better performance on English-to-Korean translation.
📚 Documentation
Model Architecture
The model uses the ke-t5-base architecture, which is based on the T5 (Text-to-Text Transfer Transformer) model.
Training Data
The model was trained on the [aihub-koen-translation-integrated-base-10m](https://huggingface.co/datasets/traintogpb/aihub-koen-translation-integrated-base-10m) dataset, which is designed for English-to-Korean translation tasks.
Training Procedure
Training Parameters
The model was trained with the following parameters:
- Learning Rate: 0.0005
- Weight Decay: 0.01
- Batch Size: 64 (training), 128 (evaluation)
- Number of Epochs: 2
- Save Steps: 500
- Maximum Saved Checkpoints: 2
- Evaluation Strategy: At the end of each epoch
- Logging Strategy: No logging
- Use of FP16: No
- Gradient Accumulation Steps: 2
- Reporting: None
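The parameters above can be expressed as a Hugging Face `Seq2SeqTrainingArguments` configuration. This is a minimal sketch, assuming the standard `Seq2SeqTrainer` setup; the `output_dir` name is a placeholder, not taken from the original training run:

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the training configuration described above ("outputs" is a placeholder path)
training_args = Seq2SeqTrainingArguments(
    output_dir="outputs",
    learning_rate=5e-4,
    weight_decay=0.01,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=128,
    num_train_epochs=2,
    save_steps=500,
    save_total_limit=2,           # keep at most 2 checkpoints
    evaluation_strategy="epoch",  # evaluate at the end of each epoch
    logging_strategy="no",
    fp16=False,
    gradient_accumulation_steps=2,
    report_to="none",
)
```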
Hardware
The training was performed on a single GPU system with an NVIDIA A100 (40GB).
Performance
The model achieved the following BLEU scores during training:
- Epoch 1: 18.006119
- Epoch 2: 18.838066
📄 License
This model is released under the Apache-2.0 license.
| Property | Details |
|----------|---------|
| Model Type | Machine translation model for English-to-Korean translation |
| Training Data | [aihub-koen-translation-integrated-base-10m](https://huggingface.co/datasets/traintogpb/aihub-koen-translation-integrated-base-10m) |