# Kanana
Kanana is a series of bilingual language models developed by Kakao. It shows excellent performance in Korean and competitive performance in English, at significantly lower computational cost than similarly sized state-of-the-art models.
## Quick Start

This README does not include step-by-step quick-start instructions; refer to the Technical Report and the Hugging Face model weights for details.
Links: Models | Blog | Technical Report | GitHub
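The released checkpoints are standard Hugging Face models, so a minimal generation script looks like the sketch below. This is an assumption-laden sketch, not an official recipe: the model ID `kakaocorp/kanana-nano-2.1b-instruct` and the use of the bundled chat template are assumptions; check the Hugging Face page for the exact released names.

```python
# Minimal generation sketch. Assumes the `transformers` library and the
# kakaocorp/kanana-nano-2.1b-instruct checkpoint ID (an assumption; see the
# Hugging Face page for the released model names).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kakaocorp/kanana-nano-2.1b-instruct"  # assumed ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Instruct checkpoints expect chat-formatted input; use the bundled template.
messages = [{"role": "user", "content": "카카오에 대해 한 문장으로 소개해 줘."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```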
## Features
- Bilingual Excellence: Excellent performance in Korean and competitive performance in English.
- Low Computational Cost: Significantly lower computational cost than state-of-the-art models of similar size.
- Advanced Techniques: Employs high-quality data filtering, staged pre-training, depth up-scaling, and pruning and distillation during pre-training, as well as supervised fine-tuning and preference optimization during post-training.
- Scenario Adaptation: Supports adapting the language models to specific scenarios such as embedding, function calling, and Retrieval-Augmented Generation (RAG); an illustrative embedding sketch follows this list.
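These adaptations are described in the Technical Report rather than exposed as a dedicated API. Purely as an illustration of the embedding scenario, the sketch below mean-pools a causal LM's last hidden states into L2-normalized sentence vectors; the model ID is an assumption, and the released Kanana embedding models may use a different architecture or pooling scheme.

```python
# Illustrative embedding recipe: mean-pool the final hidden states of a
# causal LM into L2-normalized sentence vectors. The model ID is an
# assumption; Kanana's released embedding models may use a different scheme.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

model_id = "kakaocorp/kanana-nano-2.1b-base"  # assumed ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:  # decoder-only tokenizers often lack one
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModel.from_pretrained(model_id)

def embed(texts: list[str]) -> torch.Tensor:
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state      # (batch, seq, dim)
    mask = batch["attention_mask"].unsqueeze(-1)       # ignore padding tokens
    pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
    return F.normalize(pooled, dim=-1)

docs = embed(["카카오는 한국의 IT 기업이다.", "Paris is the capital of France."])
query = embed(["한국 회사는 어디인가요?"])
print(query @ docs.T)  # cosine similarities after normalization
```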
## Important Note

Neither the pre-training nor the post-training data includes Kakao user data.
## Documentation

Table of Contents:
- News
- Performance

## Performance
Below is a partial report on the performance of the Kanana model series; please refer to the Technical Report for the full results.
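The document does not state which evaluation harness produced these numbers. Purely as a hedged sketch, academic benchmarks such as MMLU and GSM8K are commonly reproduced with EleutherAI's lm-evaluation-harness; the task names, defaults, and model ID below are assumptions, so scores may not match the tables exactly.

```python
# Hedged sketch using EleutherAI's lm-evaluation-harness
# (pip install lm-eval). The harness choice, tasks, and few-shot defaults
# are assumptions, not the report's documented setup.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=kakaocorp/kanana-nano-2.1b-base",  # assumed ID
    tasks=["mmlu", "gsm8k"],
    batch_size=8,
)
for task, metrics in results["results"].items():
    print(task, metrics)
```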
### Pre-trained Model Performance
| Models | MMLU | KMMLU | HAERAE | HumanEval | MBPP | GSM8K |
|---|---|---|---|---|---|---|
| **27b+ scale** | | | | | | |
| Kanana-Flag-32.5b | 77.68 | 62.10 | 90.47 | 51.22 | 63.40 | 70.05 |
| Qwen2.5-32b | 83.10 | 63.15 | 75.16 | 50.00 | 73.40 | 82.41 |
| Gemma-2-27b | 75.45 | 51.16 | 69.11 | 51.22 | 64.60 | 74.37 |
| EXAONE-3.5-32b | 72.68 | 46.36 | 82.22 | - | - | - |
| Aya-Expanse-32b | 74.52 | 49.57 | 80.66 | - | - | - |
| **7b+ scale** | | | | | | |
| Kanana-Essence-9.8b | 67.61 | 50.57 | 84.98 | 40.24 | 53.60 | 63.61 |
| Llama-3.1-8b | 65.18 | 41.02 | 61.78 | 35.37 | 48.60 | 50.87 |
| Qwen2.5-7b | 74.19 | 51.68 | 67.46 | 56.71 | 63.20 | 83.85 |
| Gemma-2-9b | 70.34 | 48.18 | 66.18 | 37.20 | 53.60 | 68.16 |
| EXAONE-3.5-7.8b | 65.36 | 45.30 | 77.54 | - | - | - |
| Aya-Expanse-8b | 62.52 | 40.11 | 71.95 | - | - | - |
| **2b+ scale** | | | | | | |
| Kanana-Nano-2.1b | 54.83 | 44.80 | 77.09 | 31.10 | 46.20 | 46.32 |
| Llama-3.2-3b | 56.40 | 35.57 | 47.66 | 25.61 | 39.00 | 27.37 |
| Qwen2.5-3b | 65.57 | 45.28 | 61.32 | 37.80 | 55.60 | 69.07 |
| Gemma-2-2b | 52.89 | 30.67 | 45.55 | 20.12 | 28.20 | 24.72 |
| EXAONE-3.5-2.4b | 59.27 | 43.58 | 69.65 | - | - | - |
| **70b+ scale** | | | | | | |
| Llama-3.1-70b | 78.93 | 53.00 | 76.35 | 57.32 | 66.60 | 81.73 |
| Qwen2.5-72b | 86.12 | 68.57 | 80.84 | 55.49 | 76.40 | 92.04 |
### Post-trained Model Performance

#### Instruction-following Benchmarks
| Models | MT-Bench | LogicKor | KoMT-Bench | WildBench | IFEval |
|---|---|---|---|---|---|
| **27b+ scale** | | | | | |
| Kanana-Flag-32.5b | 8.356 | 9.524 | 8.058 | 54.14 | 0.856 |
| Qwen2.5-32b | 8.331 | 8.988 | 7.847 | 51.13 | 0.822 |
| Gemma-2-27b | 8.088 | 8.869 | 7.373 | 46.46 | 0.817 |
| EXAONE-3.5-32b | 8.375 | 9.202 | 7.907 | 54.30 | 0.845 |
| Aya-Expanse-32b | 7.788 | 8.941 | 7.626 | 48.36 | 0.735 |
| **7b+ scale** | | | | | |
| Kanana-Essence-9.8b | 7.769 | 8.964 | 7.706 | 47.27 | 0.799 |
| Llama-3.1-8b | 7.500 | 6.512 | 5.336 | 33.20 | 0.772 |
| Qwen2.5-7b | 7.625 | 7.952 | 6.808 | 41.31 | 0.760 |
| Gemma-2-9b | 7.633 | 8.643 | 7.029 | 40.92 | 0.750 |
| EXAONE-3.5-7.8b | 8.213 | 9.357 | 8.013 | 50.98 | 0.826 |
| Aya-Expanse-8b | 7.131 | 8.357 | 7.006 | 38.50 | 0.645 |
| **2b+ scale** | | | | | |
| Kanana-Nano-2.1b | 6.400 | 7.964 | 5.857 | 25.41 | 0.720 |
| Llama-3.2-3b | 7.050 | 4.452 | 3.967 | 21.91 | 0.767 |
| Qwen2.5-3b | 6.969 | 6.488 | 5.274 | 25.76 | 0.355 |
| Gemma-2-2b | 7.225 | 5.917 | 4.835 | 28.71 | 0.428 |
| EXAONE-3.5-2.4b | 7.919 | 8.941 | 7.223 | 41.68 | 0.790 |
| **70b+ scale** | | | | | |
| Llama-3.1-70b | 8.275 | 8.250 | 6.970 | 46.50 | 0.875 |
| Qwen2.5-72b | 8.619 | 9.214 | 8.281 | 55.25 | 0.861 |
#### General Benchmarks
| Models | MMLU | KMMLU | HAE-RAE | HumanEval+ | MBPP+ | GSM8K | MATH |
|---|---|---|---|---|---|---|---|
| **27b+ scale** | | | | | | | |
| Kanana-Flag-32.5b | 81.08 | 64.19 | 68.18 | 77.44 | 69.84 | 90.83 | 57.82 |
| Qwen2.5-32b | 84.40 | 59.37 | 48.30 | 82.32 | 71.96 | 95.30 | 81.90 |
| Gemma-2-27b | 78.01 | 49.98 | 46.02 | 70.12 | 70.90 | 91.05 | 53.80 |
| EXAONE-3.5-32b | 78.30 | 55.44 | 52.27 | 78.66 | 70.90 | 93.56 | 76.80 |
| Aya-Expanse-32b | 74.49 | 42.35 | 51.14 | 64.63 | 65.61 | 75.06 | 42.82 |
| **7b+ scale** | | | | | | | |
| Kanana-Essence-9.8b | 70.64 | 50.76 | 47.16 | 72.56 | 69.05 | 84.91 | 42.24 |
| Llama-3.1-8b | 71.18 | 39.24 | 40.91 | 60.98 | 57.67 | 82.71 | 49.86 |
| Qwen2.5-7b | 77.23 | 46.87 | 37.50 | 73.78 | 70.63 | 91.58 | 75.22 |
| Gemma-2-9b | 73.47 | 44.47 | 39.77 | 59.76 | 64.55 | 87.72 | 48.10 |
| EXAONE-3.5-7.8b | 72.62 | 52.09 | 46.02 | 79.27 | 66.67 | 89.99 | 73.50 |
| Aya-Expanse-8b | 61.23 | 35.78 | 39.20 | 42.68 | 56.88 | 78.85 | 30.80 |
| **2b+ scale** | | | | | | | |
| Kanana-Nano-2.1b | 52.48 | 38.51 | 33.52 | 63.41 | 62.43 | 72.32 | 29.2 |
## License

The Kanana project is licensed under the cc-by-nc-4.0 license.