DNA 1.0 8B Instruct
DNA 1.0 8B Instruct is a state-of-the-art (SOTA) bilingual language model based on the Llama architecture. It is specifically optimized for Korean language understanding and generation while maintaining strong English capabilities. The model was developed through a multi-stage process: model merging via spherical linear interpolation (SLERP) with Llama 3.1 8B Instruct, knowledge distillation (KD) using Llama 3.1 405B as the teacher model, continual pre-training (CPT) on a high-quality Korean dataset, and supervised fine-tuning (SFT) followed by direct preference optimization (DPO) to align with human preferences and enhance instruction-following abilities.
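To make the SLERP step concrete, here is a minimal sketch of spherical linear interpolation between two weight vectors. It is an illustration only, not the procedure used to build this model: the function name `slerp`, the interpolation factor `t`, and the choice to operate on flat Python lists are all assumptions made for the example.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two equal-length vectors.

    t = 0 returns v0, t = 1 returns v1. Hypothetical helper for illustration;
    real merges apply this per tensor to flattened model weights.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    if norm0 < eps or norm1 < eps:
        # A zero vector has no direction, so fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # Angle between the two directions, clamped against rounding error.
    cos_omega = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)

    if sin_omega < eps:
        # Nearly parallel vectors: SLERP degenerates to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    w0 = math.sin((1 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


if __name__ == "__main__":
    # Halfway between two orthogonal unit vectors stays on the unit circle.
    print(slerp(0.5, [1.0, 0.0], [0.0, 1.0]))
```

Unlike plain averaging, the interpolated vector follows the arc between the two directions, so two unit-norm inputs yield a unit-norm result.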
Quick Start
This model requires transformers >= 4.43.0.
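A quick way to confirm the requirement before loading the model is to compare the installed version against the minimum. The sketch below uses only the standard library; the helper names (`version_tuple`, `meets_minimum`, `check_transformers`) are hypothetical, and pre-release suffixes such as `.dev0` are simply ignored, which is a simplification.

```python
import re
from importlib import metadata

MIN_TRANSFORMERS = "4.43.0"  # minimum version stated in this model card


def version_tuple(version):
    """Return the leading numeric components, e.g. '4.43.0.dev0' -> (4, 43, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed, minimum=MIN_TRANSFORMERS):
    """Compare versions numerically (a plain string comparison would rank '4.9' above '4.43')."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '4.43' and '4.43.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_transformers():
    """Return True if an adequate transformers version is installed."""
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return False
    return meets_minimum(installed)


if __name__ == "__main__":
    print("transformers OK:", check_transformers())
```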
Basic Usage
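from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

# Load the tokenizer and model. device_map='auto' places the weights on the
# available device(s) and requires the accelerate package.
tokenizer = AutoTokenizer.from_pretrained('dnotitia/Llama-DNA-1.0-8B-Instruct')
model = AutoModelForCausalLM.from_pretrained('dnotitia/Llama-DNA-1.0-8B-Instruct', device_map='auto')
# Print tokens to stdout as they are generated, omitting the prompt and special tokens.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

conversation = [
    {"role": "system", "content": "You are a helpful assistant, Dnotitia DNA."},
    {"role": "user", "content": "너의 이름은?"},  # Korean for "What is your name?"
]
# Format the conversation with the model's chat template and tokenize it.
inputs = tokenizer.apply_chat_template(conversation,
                                       add_generation_prompt=True,
                                       return_dict=True,
                                       return_tensors="pt").to(model.device)
_ = model.generate(**inputs, streamer=streamer)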
Features
- Bilingual Excellence: Optimized for Korean while maintaining strong English capabilities.
- Sophisticated Development Process: Involves model merging, knowledge distillation, continual pre-training, supervised fine-tuning, and direct preference optimization.
- Extensive Instruction Tuning: Fine-tuned on approximately 7B tokens of curated data to follow complex instructions and engage in natural conversations.
Information
| Property | Details |
|---|---|
| Model Type | DNA 1.0 8B Instruct (bilingual language model based on Llama architecture) |
| Developed by | Dnotitia Inc. |
| Supported Languages | Korean, English |
| Model Release Date | Dec 10, 2024 |
| Vocab Size | 128,256 |
| Context Length | 131,072 tokens (128k) |
| License | CC BY-NC 4.0 |
Documentation
Evaluation
We evaluated DNA 1.0 8B Instruct against other prominent language models of similar size across various benchmarks, including Korean-specific tasks and general language understanding metrics.
| Language | Benchmark | dnotitia/Llama-DNA-1.0-8B-Instruct | LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct | LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct | yanolja/EEVE-Korean-Instruct-10.8B-v1.0 | Qwen/Qwen2.5-7B-Instruct | meta-llama/Llama-3.1-8B-Instruct | mistralai/Mistral-7B-Instruct-v0.3 | NCSOFT/Llama-VARCO-8B-Instruct | upstage/SOLAR-10.7B-Instruct-v1.0 |
|---|---|---|---|---|---|---|---|---|---|---|
| Korean | KMMLU | **53.26** (1st) | 45.30 | 45.28 | 42.17 | <u>45.66</u> | 41.66 | 31.45 | 38.49 | 41.50 |
| | KMMLU-hard | **29.46** (1st) | 23.17 | 20.78 | 19.25 | <u>24.78</u> | 20.49 | 17.86 | 19.83 | 20.61 |
| | KoBEST | **83.40** (1st) | 79.05 | 80.13 | <u>81.67</u> | 78.51 | 67.56 | 63.77 | 72.99 | 73.26 |
| | Belebele | **57.99** (1st) | 40.97 | 45.11 | 49.40 | <u>54.85</u> | 54.70 | 40.31 | 53.17 | 48.68 |
| | CSATQA | <u>43.32</u> (2nd) | 40.11 | 34.76 | 39.57 | **45.45** | 36.90 | 27.27 | 32.62 | 34.22 |
| English | MMLU | 66.64 (3rd) | 65.27 | 64.32 | 63.63 | **74.26** | <u>68.26</u> | 62.04 | 63.25 | 65.30 |
| | MMLU-Pro | **43.05** (1st) | 40.73 | 38.90 | 32.79 | <u>42.5</u> | 40.92 | 33.49 | 37.11 | 30.25 |
| | GSM8K | **80.52** (1st) | 65.96 | <u>80.06</u> | 56.18 | 75.74 | 75.82 | 49.66 | 64.14 | 69.22 |

- The highest scores are in bold, and the second-highest scores are underlined.
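The rank annotations next to the DNA scores, such as "(2nd)" and "(3rd)", can be recomputed directly from the reported numbers. The sketch below does this for three of the benchmarks; the scores are copied from the table, while the names `MODELS`, `SCORES`, and `rank_of` are introduced here purely for illustration.

```python
# Column order of the evaluation table.
MODELS = [
    "dnotitia/Llama-DNA-1.0-8B-Instruct",
    "LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct",
    "LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct",
    "yanolja/EEVE-Korean-Instruct-10.8B-v1.0",
    "Qwen/Qwen2.5-7B-Instruct",
    "meta-llama/Llama-3.1-8B-Instruct",
    "mistralai/Mistral-7B-Instruct-v0.3",
    "NCSOFT/Llama-VARCO-8B-Instruct",
    "upstage/SOLAR-10.7B-Instruct-v1.0",
]

# Scores copied from the table, one list per benchmark, in MODELS order.
SCORES = {
    "CSATQA": [43.32, 40.11, 34.76, 39.57, 45.45, 36.90, 27.27, 32.62, 34.22],
    "MMLU": [66.64, 65.27, 64.32, 63.63, 74.26, 68.26, 62.04, 63.25, 65.30],
    "GSM8K": [80.52, 65.96, 80.06, 56.18, 75.74, 75.82, 49.66, 64.14, 69.22],
}


def rank_of(model, benchmark):
    """1-based rank of a model on a benchmark (higher score is better)."""
    scores = SCORES[benchmark]
    target = scores[MODELS.index(model)]
    # Rank is one more than the number of models that scored strictly higher.
    return 1 + sum(score > target for score in scores)


if __name__ == "__main__":
    for benchmark in SCORES:
        print(benchmark, rank_of(MODELS[0], benchmark))
```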
Evaluation Protocol
For easy reproduction of our evaluation results, we list the evaluation tools and settings used below:
| Benchmark | Evaluation setting | Metric | Evaluation tool |
|---|---|---|---|
| KMMLU | 5-shot | macro_avg / exact_match | lm-eval-harness |
| KMMLU-hard | 5-shot | macro_avg / exact_match | lm-eval-harness |
| KoBEST | 5-shot | macro_avg / f1 | lm-eval-harness |
| Belebele | 0-shot | acc | lm-eval-harness |
| CSATQA | 0-shot | acc_norm | lm-eval-harness |
| MMLU | 5-shot | macro_avg / acc | lm-eval-harness |
| MMLU-Pro | 5-shot | macro_avg / exact_match | lm-eval-harness |
| GSM8K | 5-shot | acc, exact_match & strict_extract | lm-eval-harness |
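As a starting point for reproduction, the settings above can be turned into lm-eval-harness command lines. The sketch below only builds and prints the commands. The few-shot counts come from the table, but the task names in `PROTOCOL` are assumptions that should be checked against the task registry of the installed harness version, and the exact invocation used for the reported results is not stated in this card.

```python
import shlex

MODEL_ID = "dnotitia/Llama-DNA-1.0-8B-Instruct"

# Benchmark -> (harness task name, few-shot count).
# Few-shot counts are taken from the protocol table; task names are ASSUMED.
PROTOCOL = {
    "KMMLU": ("kmmlu", 5),
    "KMMLU-hard": ("kmmlu_hard", 5),
    "KoBEST": ("kobest", 5),
    "Belebele": ("belebele_kor_Hang", 0),
    "CSATQA": ("csatqa", 0),
    "MMLU": ("mmlu", 5),
    "MMLU-Pro": ("mmlu_pro", 5),
    "GSM8K": ("gsm8k", 5),
}


def build_command(benchmark, model_id=MODEL_ID):
    """Return the argument list for one lm_eval run of the given benchmark."""
    task, shots = PROTOCOL[benchmark]
    return [
        "lm_eval",
        "--model", "hf",
        "--model_args", f"pretrained={model_id}",
        "--tasks", task,
        "--num_fewshot", str(shots),
        "--batch_size", "auto",
    ]


if __name__ == "__main__":
    for name in PROTOCOL:
        # shlex.join quotes arguments so each line can be pasted into a shell.
        print(shlex.join(build_command(name)))
```

Returning a list rather than a single string keeps the arguments safe to pass to `subprocess.run` without shell quoting issues.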
Limitations
While DNA 1.0 8B Instruct demonstrates strong performance, users should be aware of the following limitations:
- The model may occasionally generate biased or inappropriate content.
- Responses are based on training data and may not reflect current information.
- The model may sometimes produce factually incorrect or inconsistent answers.
- Performance may vary depending on the complexity and domain of the task.
- Generated content should be reviewed for accuracy and appropriateness.
Appendix
- The median percentage difference in the model's weights before and after the merge (our SFT model + Llama 3.1 8B Instruct): [figure not available]
License
This model is released under the CC BY-NC 4.0 license. For commercial usage inquiries, please contact us.
Citation
If you use or discuss this model in your academic research, please cite the project to help spread awareness:
@misc{lee2025dna10technicalreport,
  title={DNA 1.0 Technical Report},
  author={Jungyup Lee and Jemin Kim and Sang Park and SeungJae Lee},
  year={2025},
  eprint={2501.10648},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2501.10648},
}
Important Note
The model may generate biased, inappropriate, factually incorrect or inconsistent content. Generated content should be reviewed for accuracy and appropriateness.
Usage Tip
For commercial use of the model, please contact us.