Llama-DNA-1.0-8B-Instruct Open-source Bilingual Model - Optimized for Korean Comprehension and Generation, with Equally Strong English Capabilities

Llama DNA 1.0 8B Instruct

Developed by dnotitia

A state-of-the-art bilingual language model based on the Llama architecture, specially optimized for Korean understanding and generation while maintaining strong English capabilities.

Large Language Model

Transformers

Supports Multiple Languages #Korean optimization #Knowledge distillation #Bilingual dialogue

Release Time: 12/6/2024

Model Overview

The DNA 1.0 8B Instruct model was developed by combining several techniques: model merging via spherical linear interpolation (SLERP) with Llama 3.1 8B Instruct, knowledge distillation (KD) with Llama 3.1 405B as the teacher model, continual pre-training (CPT) on high-quality Korean datasets, and, to complete training, supervised fine-tuning (SFT) and direct preference optimization (DPO).
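
For readers unfamiliar with SLERP, the following is a minimal sketch of the interpolation formula applied to two flat weight vectors. It is illustrative only: the function name `slerp`, the interpolation factor `t`, and the fallback to linear interpolation for near-parallel or zero vectors are assumptions, and the actual merge settings used for DNA 1.0 are not given here.

```python
import math


def slerp(v0, v1, t, eps=1e-8):
    """Spherical linear interpolation between two equal-length vectors.

    t=0 returns v0, t=1 returns v1 (hypothetical helper, not the authors' code).
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    if norm0 < eps or norm1 < eps:
        # A zero vector has no direction: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # Cosine of the angle between the vectors, clamped against rounding error.
    cos_omega = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)

    if sin_omega < eps:
        # Nearly parallel vectors: SLERP degenerates to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    w0 = math.sin((1 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Halfway between two orthogonal unit vectors stays on the unit circle.
midpoint = slerp([1.0, 0.0], [0.0, 1.0], 0.5)
print(midpoint)
```

In a real merge the same formula would be applied tensor by tensor to the two checkpoints; how `t` is chosen per layer is a design decision this sketch does not cover.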

Model Features

Optimized Korean Capabilities

Specially optimized for Korean understanding and generation while maintaining strong English capabilities.

Advanced Training Methods

Utilizes various advanced training techniques including spherical linear interpolation (SLERP), knowledge distillation (KD), continual pre-training (CPT), supervised fine-tuning (SFT), and direct preference optimization (DPO).

Long Context Support

Supports long context processing of up to 131,072 tokens (128k).
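
Because the prompt and the generated tokens share the same context window, it can help to work out how many new tokens remain for a given prompt length. The sketch below is a simple arithmetic illustration; the helper name `max_new_tokens_for` and the `reserve` parameter are hypothetical.

```python
# Context length stated in this model card: 128 * 1024 tokens.
CONTEXT_LENGTH = 131_072


def max_new_tokens_for(prompt_tokens, reserve=0, context_length=CONTEXT_LENGTH):
    """Return how many tokens can still be generated after the prompt.

    `reserve` holds back some tokens (for example, for a later turn).
    The result is never negative.
    """
    remaining = context_length - prompt_tokens - reserve
    return max(remaining, 0)


print(max_new_tokens_for(100_000))
```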

Human Preference Alignment

Outputs are more aligned with human preferences through the direct preference optimization (DPO) training process.
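
As a rough illustration of what DPO optimizes, the sketch below computes the standard DPO loss for a single preference pair from sequence log-probabilities. The function name, argument names, and the value of `beta` are assumptions for illustration; the training configuration used for this model is not stated here.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair.

    Each argument is the summed log-probability of a full response under
    the policy being trained or under the frozen reference model.
    """
    # How much more the policy favors each response than the reference does.
    chosen_gain = policy_chosen_logp - ref_chosen_logp
    rejected_gain = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_gain - rejected_gain)

    # -log(sigmoid(margin)), written to avoid overflow for large |margin|.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The loss falls below log(2) once the policy prefers the chosen response
# more strongly than the reference model does.
print(dpo_loss(-10.0, -14.0, -12.0, -12.0))
```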

Model Capabilities

Korean text generation

English text generation

Multi-turn dialogue

Complex instruction understanding

Knowledge Q&A

Use Cases

Intelligent Assistants

Korean Chatbot

Intelligent conversational assistant for Korean environments

Scores 53.26 on KMMLU and 83.40 on KoBEST, the highest among the compared models of similar size

Education

Language Learning Assistant

Helps learners practice Korean and English

Business Applications

Bilingual Customer Service System

Handles customer inquiries in Korean and English

🚀 DNA 1.0 8B Instruct

DNA 1.0 8B Instruct is a state-of-the-art (SOTA) bilingual language model based on the Llama architecture. It is specifically optimized for Korean language understanding and generation while maintaining strong English capabilities. This model was developed through a complex process, including model merging via spherical linear interpolation (SLERP) with Llama 3.1 8B Instruct, knowledge distillation (KD) using Llama 3.1 405B as the teacher model, continual pre-training (CPT) with a high-quality Korean dataset, and supervised fine-tuning (SFT) and direct preference optimization (DPO) to align with human preferences and enhance instruction-following abilities.
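
To make the knowledge-distillation step more concrete, here is a minimal sketch of a common distillation objective: the KL divergence between temperature-softened teacher and student distributions for one token position. This is a generic formulation, not necessarily the one used for DNA 1.0; the temperature value and helper names are assumptions.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_sum for x in scaled]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)
    kl = sum(
        math.exp(lt) * (lt - ls)
        for lt, ls in zip(log_p_teacher, log_p_student)
    )
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return kl * temperature ** 2


# The loss is zero when the student matches the teacher and positive otherwise.
print(distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0]))
print(distillation_loss([0.0, 0.0, 0.0], [2.0, 0.5, -1.0]))
```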

🚀 Quick Start

This model requires transformers >= 4.43.0.
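
One way to verify this requirement before loading the model is sketched below. It uses only the standard library; the helper names are hypothetical, and the comparison deliberately ignores pre-release suffixes such as `.dev0` or `rc1`.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_release(version_string):
    """Turn '4.43.0' or '4.44.0.dev0' into a tuple of leading integers."""
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum="4.43.0"):
    """Compare release tuples numerically, padding the shorter with zeros."""
    a, b = parse_release(installed), parse_release(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def transformers_is_recent_enough(minimum="4.43.0"):
    """Return False if transformers is missing or older than `minimum`."""
    try:
        return meets_minimum(version("transformers"), minimum)
    except PackageNotFoundError:
        return False


print(transformers_is_recent_enough())
```

Comparing integer tuples rather than raw strings avoids the pitfall where "4.9.0" sorts after "4.43.0" lexicographically.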

Basic Usage

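from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

# Load the tokenizer and model. device_map='auto' places the weights on the
# available device(s) and requires the accelerate package.
tokenizer = AutoTokenizer.from_pretrained('dnotitia/Llama-DNA-1.0-8B-Instruct')
model = AutoModelForCausalLM.from_pretrained('dnotitia/Llama-DNA-1.0-8B-Instruct', device_map='auto')
# Print tokens as they are generated, omitting the prompt and special tokens.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

conversation = [
    {"role": "system", "content": "You are a helpful assistant, Dnotitia DNA."},
    {"role": "user", "content": "너의 이름은?"},  # "What is your name?"
]
# Format the conversation with the model's chat template and append the
# assistant generation prompt.
inputs = tokenizer.apply_chat_template(conversation,
                                       add_generation_prompt=True,
                                       return_dict=True,
                                       return_tensors="pt").to(model.device)
# The streamer prints the reply, so the returned token IDs are discarded.
_ = model.generate(**inputs, streamer=streamer)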
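
For multi-turn dialogue, the conversation list has to carry every earlier turn, including the assistant's replies. The sketch below shows that bookkeeping in isolation. `chat_turn` and `echo_reply` are hypothetical helpers, and `echo_reply` is only a stand-in for the model call so the example runs without downloading weights.

```python
def chat_turn(conversation, user_message, generate_reply):
    """Append a user message, obtain a reply, and record it in the history."""
    conversation.append({"role": "user", "content": user_message})
    reply = generate_reply(conversation)
    conversation.append({"role": "assistant", "content": reply})
    return reply


def echo_reply(conversation):
    """Stand-in for the model: echoes the latest user message."""
    return f"(reply to: {conversation[-1]['content']})"


history = [
    {"role": "system", "content": "You are a helpful assistant, Dnotitia DNA."},
]
chat_turn(history, "너의 이름은?", echo_reply)
chat_turn(history, "What can you do?", echo_reply)

for message in history:
    print(message["role"], ":", message["content"])
```

With the real model, `generate_reply` would apply the chat template to the full history, call `model.generate`, and decode only the newly generated tokens before returning them as a string.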

✨ Features

Bilingual Excellence: Optimized for Korean while maintaining strong English capabilities.
Sophisticated Development Process: Involves model merging, knowledge distillation, continual pre-training, supervised fine-tuning, and direct preference optimization.
Extensive Instruction Tuning: Fine-tuned on approximately 7B tokens of curated data to follow complex instructions and engage in natural conversations.

📦 Information

Property	Details
Model Type	DNA 1.0 8B Instruct (bilingual language model based on Llama architecture)
Developed by	Dnotitia Inc.
Supported Languages	Korean, English
Model Release Date	Dec 10, 2024
Vocab Size	128,256
Context Length	131,072 tokens (128k)
License	CC BY-NC 4.0

📚 Documentation

Evaluation

We evaluated DNA 1.0 8B Instruct against other prominent language models of similar size across various benchmarks, including Korean-specific tasks and general language understanding metrics.

Language	Benchmark	dnotitia/Llama-DNA-1.0-8B-Instruct	LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct	LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct	yanolja/EEVE-Korean-Instruct-10.8B-v1.0	Qwen/Qwen2.5-7B-Instruct	meta-llama/Llama-3.1-8B-Instruct	mistralai/Mistral-7B-Instruct-v0.3	NCSOFT/Llama-VARCO-8B-Instruct	upstage/SOLAR-10.7B-Instruct-v1.0
Korean	KMMLU	53.26 (1st)	45.30	45.28	42.17	45.66	41.66	31.45	38.49	41.50
	KMMLU-hard	29.46 (1st)	23.17	20.78	19.25	24.78	20.49	17.86	19.83	20.61
	KoBEST	83.40 (1st)	79.05	80.13	81.67	78.51	67.56	63.77	72.99	73.26
	Belebele	57.99 (1st)	40.97	45.11	49.40	54.85	54.70	40.31	53.17	48.68
	CSATQA	43.32 (2nd)	40.11	34.76	39.57	45.45	36.90	27.27	32.62	34.22
English	MMLU	66.64 (3rd)	65.27	64.32	63.63	74.26	68.26	62.04	63.25	65.30
	MMLU-Pro	43.05 (1st)	40.73	38.90	32.79	42.5	40.92	33.49	37.11	30.25
	GSM8K	80.52 (1st)	65.96	80.06	56.18	75.74	75.82	49.66	64.14	69.22

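The rank of DNA 1.0 8B Instruct among the compared models is shown in parentheses. In the original table, the highest score for each benchmark is in bold and the second-highest is underlined.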
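
The ranks in parentheses can be re-derived from the scores in the table. The short check below copies each benchmark row (DNA 1.0 8B Instruct first, then the other models in column order) and counts how many models score strictly higher. The helper name `rank_of` is hypothetical.

```python
# One row per benchmark, in the table's column order; index 0 is
# dnotitia/Llama-DNA-1.0-8B-Instruct.
SCORES = {
    "KMMLU":      [53.26, 45.30, 45.28, 42.17, 45.66, 41.66, 31.45, 38.49, 41.50],
    "KMMLU-hard": [29.46, 23.17, 20.78, 19.25, 24.78, 20.49, 17.86, 19.83, 20.61],
    "KoBEST":     [83.40, 79.05, 80.13, 81.67, 78.51, 67.56, 63.77, 72.99, 73.26],
    "Belebele":   [57.99, 40.97, 45.11, 49.40, 54.85, 54.70, 40.31, 53.17, 48.68],
    "CSATQA":     [43.32, 40.11, 34.76, 39.57, 45.45, 36.90, 27.27, 32.62, 34.22],
    "MMLU":       [66.64, 65.27, 64.32, 63.63, 74.26, 68.26, 62.04, 63.25, 65.30],
    "MMLU-Pro":   [43.05, 40.73, 38.90, 32.79, 42.5, 40.92, 33.49, 37.11, 30.25],
    "GSM8K":      [80.52, 65.96, 80.06, 56.18, 75.74, 75.82, 49.66, 64.14, 69.22],
}


def rank_of(scores, index=0):
    """1-based rank of scores[index]: one plus the count of higher scores."""
    return 1 + sum(s > scores[index] for s in scores)


ranks = {benchmark: rank_of(row) for benchmark, row in SCORES.items()}
for benchmark, rank in ranks.items():
    print(f"{benchmark}: rank {rank} of {len(SCORES[benchmark])}")
```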

Evaluation Protocol
For easy reproduction of our evaluation results, we list the evaluation tools and settings used below:

Benchmark	Evaluation setting	Metric	Evaluation tool
KMMLU	5-shot	macro_avg / exact_match	lm-eval-harness
KMMLU Hard	5-shot	macro_avg / exact_match	lm-eval-harness
KoBEST	5-shot	macro_avg / f1	lm-eval-harness
Belebele	0-shot	acc	lm-eval-harness
CSATQA	0-shot	acc_norm	lm-eval-harness
MMLU	5-shot	macro_avg / acc	lm-eval-harness
MMLU Pro	5-shot	macro_avg / exact_match	lm-eval-harness
GSM8K	5-shot	acc, exact_match & strict_extract	lm-eval-harness
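
The few-shot settings above could be turned into harness invocations along the following lines. This is a sketch under assumptions: it assumes the `lm_eval` command-line entry point of lm-evaluation-harness with its `--model`, `--model_args`, `--tasks`, and `--num_fewshot` options, and the task identifier passed in (for example `kmmlu`) is an assumption that should be checked against the harness's task list. The helper name `build_lm_eval_command` is hypothetical, and the code only builds the argument list without running it.

```python
# Few-shot counts copied from the evaluation-protocol table.
NUM_FEWSHOT = {
    "KMMLU": 5,
    "KMMLU Hard": 5,
    "KoBEST": 5,
    "Belebele": 0,
    "CSATQA": 0,
    "MMLU": 5,
    "MMLU Pro": 5,
    "GSM8K": 5,
}


def build_lm_eval_command(benchmark, task_name,
                          model_id="dnotitia/Llama-DNA-1.0-8B-Instruct"):
    """Build (but do not run) an lm_eval argument list for one benchmark.

    `task_name` is the harness task identifier; it is supplied by the
    caller because the exact identifiers are not given in this document.
    """
    return [
        "lm_eval",
        "--model", "hf",
        "--model_args", f"pretrained={model_id}",
        "--tasks", task_name,
        "--num_fewshot", str(NUM_FEWSHOT[benchmark]),
    ]


# "kmmlu" is an assumed task identifier, used here only for illustration.
command = build_lm_eval_command("KMMLU", "kmmlu")
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run` once the task names and any batch-size or device options have been confirmed for the installed harness version.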

Limitations

While DNA 1.0 8B Instruct demonstrates strong performance, users should be aware of the following limitations:

The model may occasionally generate biased or inappropriate content.
Responses are based on training data and may not reflect current information.
The model may sometimes produce factually incorrect or inconsistent answers.
Performance may vary depending on the complexity and domain of the task.
Generated content should be reviewed for accuracy and appropriateness.

Appendix

[Figure: KMMLU scores comparison chart]

[Figure: DNA 1.0 8B Instruct model architecture ¹]

[Figure: Median percentage difference in model weights before and after the merge (our SFT model + Llama 3.1 8B Instruct)]

📄 License

This model is released under the CC BY-NC 4.0 license. For commercial usage inquiries, please contact us.

📖 Citation

If you use or discuss this model in your academic research, please cite the project to help spread awareness:

@misc{lee2025dna10technicalreport,
      title={DNA 1.0 Technical Report}, 
      author={Jungyup Lee and Jemin Kim and Sang Park and SeungJae Lee},
      year={2025},
      eprint={2501.10648},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2501.10648}, 
}

⚠️ Important Note

The model may generate biased, inappropriate, factually incorrect or inconsistent content. Generated content should be reviewed for accuracy and appropriateness.

💡 Usage Tip

For commercial use of the model, please contact us.
