# HyperCLOVAX-SEED-Text-Instruct-0.5B
HyperCLOVAX-SEED-Text-Instruct-0.5B is a text-to-text model with instruction-following capabilities, excelling in understanding Korean language and culture. It offers lightweight deployment for resource-constrained environments.

## 🚀 Quick Start
The HyperCLOVAX-SEED-Text-Instruct-0.5B model is readily available for use. You can follow the usage example below to start interacting with the model.
## ✨ Features
- Korean Proficiency: Demonstrates excellent understanding of Korean language and culture.
- Mathematical Performance: Shows improved mathematical abilities compared to similar-scale competitors.
- Lightweight Design: The smallest model released by HyperCLOVAX, suitable for resource-constrained environments like edge devices.
- Cost-Effective: Significantly lower training costs compared to industry-leading competitors of similar scale.
## 📦 Installation

The model is loaded through the Hugging Face `transformers` library, and the usage example below moves it to a CUDA device, so a GPU-enabled PyTorch build is also required.
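A minimal setup sketch (the exact package set is inferred from the usage example below, not stated by the original card):

```bash
pip install transformers torch
```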
## 💻 Usage Examples

### Basic Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer from the Hugging Face Hub and move the model to GPU.
model = AutoModelForCausalLM.from_pretrained("naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-0.5B").to(device="cuda")
tokenizer = AutoTokenizer.from_pretrained("naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-0.5B")

chat = [
    {"role": "tool_list", "content": ""},
    # System prompt (Korean): 'The AI language model is named "CLOVA X" and was built by NAVER. Today is Thursday, April 24, 2025.'
    {"role": "system", "content": '- AI 언어모델의 이름은 "CLOVA X" 이며 네이버에서 만들었다.\n- 오늘은 2025년 04월 24일(목)이다.'},
    # User turn (Korean): "Explain the relationship between the Schrödinger equation and quantum mechanics in as much detail as possible."
    {"role": "user", "content": "슈뢰딩거 방정식과 양자역학의 관계를 최대한 자세히 알려줘."},
]

# Render the chat with the model's chat template, then generate until a stop token.
inputs = tokenizer.apply_chat_template(chat, add_generation_prompt=True, return_dict=True, return_tensors="pt")
inputs = inputs.to(device="cuda")
output_ids = model.generate(**inputs, max_length=1024, stop_strings=["<|endofturn|>", "<|stop|>"], repetition_penalty=1.2, tokenizer=tokenizer)
print(tokenizer.batch_decode(output_ids))
```
Result (abbreviated): the model answers in Korean with a long, sectioned explanation covering the Schrödinger equation itself, its relationship to quantum mechanics, and its application areas.

```
['<|im_start|>tool_list\n<|im_end|>\n<|im_start|>system\n- AI 언어모델의 이름은 "CLOVA X" 이며 네이버에서 만들었다.\n- 오늘은 2025년 04월 24일(목)이다.<|im_end|>\n<|im_start|>user\n슈뢰딩거 방정식과 양자역학의 관계를 최대한 자세히 알려줘.<|im_end|>\n<|im_start|>assistant\n양자역학은 슈뢰딩거 방정식을 통해 물질과 에너지, 공간 등의 현상을 설명합니다. ...<|im_end|><|endofturn|>']
```
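For interactive use it can help to stream tokens to the console as they are generated. Below is a minimal sketch using `transformers.TextStreamer`, a generic `transformers` utility rather than anything specific to this model; the chat content is illustrative, and the generation settings mirror the example above:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

model_id = "naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-0.5B"
model = AutoModelForCausalLM.from_pretrained(model_id).to(device="cuda")
tokenizer = AutoTokenizer.from_pretrained(model_id)

chat = [
    {"role": "tool_list", "content": ""},
    # Korean: "Hello! Please give a short self-introduction."
    {"role": "user", "content": "안녕하세요! 간단히 자기소개 부탁해요."},
]
inputs = tokenizer.apply_chat_template(chat, add_generation_prompt=True, return_dict=True, return_tensors="pt").to("cuda")

# TextStreamer prints each decoded chunk to stdout as soon as it is generated.
streamer = TextStreamer(tokenizer, skip_prompt=True)
model.generate(**inputs, max_length=1024, stop_strings=["<|endofturn|>", "<|stop|>"], repetition_penalty=1.2, tokenizer=tokenizer, streamer=streamer)
```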
## 📚 Documentation

### Overview
HyperCLOVAX-SEED-Text-Instruct-0.5B is a Text-to-Text model with instruction-following capabilities that excels in understanding Korean language and culture. Compared to external competitors of similar scale, it demonstrates improved mathematical performance and a substantial enhancement in Korean language capability. It supports a maximum context length of 4K and functions as a versatile small model applicable to a wide range of tasks.
### Basic Information

| Property | Details |
| --- | --- |
| Model Type | Transformer-based (Dense Model) |
| Parameters | 0.57 B (total); 0.45 B (excluding token embeddings, tied embeddings) |
| Input/Output Format | Text / Text |
| Maximum Context Length | 4 K tokens |
| Knowledge Cutoff Date | Trained on data up to January 2025 |
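Since generation beyond the 4 K-token window is unsupported, it is worth checking prompt length before calling `generate`. A minimal sketch (the 4,096-token constant is our reading of "4 K" above, and the budgeting helper is our own, not part of the card):

```python
from transformers import AutoTokenizer

MAX_CONTEXT = 4096  # "4 K tokens" from the table above, assumed to mean 4096

tokenizer = AutoTokenizer.from_pretrained("naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-0.5B")

def fits_in_context(chat: list[dict], max_new_tokens: int = 256) -> bool:
    """Check that the rendered chat plus the generation budget stays within the window."""
    ids = tokenizer.apply_chat_template(chat, add_generation_prompt=True)
    return len(ids) + max_new_tokens <= MAX_CONTEXT

chat = [{"role": "user", "content": "슈뢰딩거 방정식과 양자역학의 관계를 최대한 자세히 알려줘."}]
print(fits_in_context(chat))  # True: a short prompt leaves plenty of headroom
```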
### Training and Data

The training dataset for HyperCLOVAX-SEED-Text-Instruct-0.5B consists of diverse sources, including the high-quality data accumulated during the development of HyperCLOVA X. Training was conducted in three main stages:

- Pretraining: Knowledge acquisition using high-quality data and a high-performance pretrained model.
- Rejection Sampling Fine-Tuning (RFT): Enhancement of multi-domain knowledge and complex reasoning capabilities (see the sketch after this list).
- Supervised Fine-Tuning (SFT): Improvement of instruction-following proficiency.
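The card does not detail the RFT stage; the sketch below is only our illustration of the general rejection-sampling recipe the name implies: sample several candidate responses per prompt, keep the ones a scorer accepts, and fine-tune on the survivors. Both `generate_candidates` and `score` are hypothetical placeholders:

```python
import random

def generate_candidates(prompt: str, n: int = 8) -> list[str]:
    """Stand-in for sampling n candidate responses from the current model (hypothetical)."""
    return [f"candidate {i} for {prompt!r}" for i in range(n)]

def score(prompt: str, response: str) -> float:
    """Stand-in for a verifier or reward model scoring a response (hypothetical)."""
    return random.random()

def build_rft_dataset(prompts: list[str], threshold: float = 0.9) -> list[dict]:
    """Keep only the best-scoring candidate per prompt, and only if it clears the bar."""
    kept = []
    for prompt in prompts:
        scored = [(score(prompt, r), r) for r in generate_candidates(prompt)]
        best_score, best = max(scored, key=lambda pair: pair[0])
        if best_score >= threshold:
            # Accepted pairs are later used as supervised fine-tuning examples.
            kept.append({"prompt": prompt, "response": best})
    return kept

print(build_rft_dataset(["한국의 수도는 어디인가요?"]))  # "What is the capital of Korea?"
```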
### Training Cost

HyperCLOVAX-SEED-Text-Instruct-0.5B leveraged HyperCLOVA X's lightweight training process and high-quality data to achieve significantly lower training costs than industry-leading competitors of similar scale. Excluding the SFT stage, a single pretraining run incurred:

| Pretraining Cost Category | HyperCLOVAX-SEED-Text-Instruct-0.5B | QWEN2.5-0.5B-instruct |
| --- | --- | --- |
| A100 GPU Hours | 4.358 K | 169.257 K |
| Cost (USD) | 6.537 K | 253.886 K |

This represents approximately a 39× reduction in pretraining cost relative to QWEN2.5-0.5B-instruct (169.257 K / 4.358 K ≈ 38.8 in GPU hours, and likewise 253.886 K / 6.537 K ≈ 38.8 in USD).
### Benchmarks

| Model | KMMLU (5-shot, acc) | HAE-RAE (5-shot, acc) | CLIcK (5-shot, acc) | KoBEST (5-shot, acc) |
| --- | --- | --- | --- | --- |
| HyperCLOVAX-SEED-Text-Base-0.5B | 0.4181 | 0.6370 | 0.5373 | 0.6963 |
| HyperCLOVAX-SEED-Text-Instruct-0.5B | 0.3815 | 0.5619 | 0.4446 | 0.6299 |
| QWEN2.5-0.5B-instruct | 0.2968 | 0.3428 | 0.3805 | 0.5025 |
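All four benchmarks are Korean evaluation suites scored as 5-shot accuracy, but the card does not say which harness produced these numbers. As one way to run comparable evaluations, the sketch below uses the Python API of EleutherAI's lm-evaluation-harness; the task names `kmmlu`, `haerae`, and `kobest` are assumptions about that harness's task registry (CLIcK may not be bundled with it), so verify them before relying on the results:

```python
import lm_eval  # EleutherAI lm-evaluation-harness: pip install lm-eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-0.5B",
    tasks=["kmmlu", "haerae", "kobest"],  # assumed task names; confirm against the harness's task list
    num_fewshot=5,
)
print(results["results"])
```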
## 📄 License

The model is licensed under the `hyperclovax-seed` license. You can find more details in the LICENSE file.