Kanana 1.5-8b-base Open-source Bilingual Large Model - Free Deployment, Strong in Programming and Mathematics, Excellent in Long-text Processing

Kanana 1.5 8b Base

Developed by kakaocorp

Kanana 1.5 is a bilingual (English and Korean) large language model developed by Kakao Corporation. It offers significant improvements in programming, mathematics, and function calling, and natively supports a 32K-token context length.

Large Language Model

Transformers

Supports Multiple Languages

Open Source License: Apache-2.0

#32K long text processing #Bilingual programming enhancement #Function calling optimization

Downloads 432

Release Time : 4/15/2025

Model Overview

Kanana 1.5 is the new version of the Kanana model family. It is optimized for programming, mathematics, and function calling, and supports long-text processing for complex scenarios.

Model Features

Enhanced programming and mathematical capabilities

Significant improvements in programming and mathematical tasks compared to the previous generation model

Long text processing

Natively supports 32K tokens context length, extendable to 128K tokens via YaRN technology

Bilingual support

Supports both English and Korean processing

Optimized post-training process

Achieves more natural and precise conversational interactions

Model Capabilities

Text generation

Code generation

Mathematical reasoning

Long document processing

Bilingual understanding

Use Cases

Programming assistance

Code generation

Generates code based on natural language descriptions

HumanEval test score 61.59

Code completion

Assists developers in completing code snippets

MBPP test score 57.80

Mathematical applications

Mathematical problem solving

Solves complex mathematical problems

GSM8K test score 63.53

Long document processing

Document summarization

Processes documents up to 32K tokens and generates summaries

🚀 Kanana 1.5 - 8B Base

Kanana 1.5 is a new version of the Kanana model family, offering significant improvements in coding, mathematics, and function calling capabilities. It can handle up to 32K tokens natively and up to 128K tokens with YaRN, and provides more natural and accurate conversations through refined post-training.

🚀 Quick Start

This section provides an overview of the Kanana 1.5 model and its features. For more detailed information, please refer to the corresponding sections below.

✨ Features

Enhanced Capabilities: Kanana 1.5 shows substantial improvements in coding, mathematics, and function calling capabilities compared to the previous version.
Long-Sequence Handling: It can natively handle up to 32K tokens and up to 128K tokens using YaRN, maintaining coherence in long-document processing and extended conversations.
Refined Conversations: Through a refined post-training process, it delivers more natural and accurate conversations.

📚 Documentation

News

2025/05/23: Published a blog post about Kanana 1.5 models and released HF model weights.
2025/02/27: Released Technical Report and HF model weights.
2025/01/10: Published a blog post about the development of Kanana Nano model.
2024/11/14: Published blog posts (pre-training, post-training) about the development of Kanana models.
2024/11/06: Published a presentation video about the development of the Kanana models.

Table of Contents

- Kanana 1.5
  - Performance
    - Base Model Evaluation
    - Instruct Model Evaluation
  - Processing 32K+ Length
- Contributors
- Citation
- Contact

Kanana 1.5

Kanana 1.5, a newly introduced version of the Kanana model family, presents substantial enhancements in coding, mathematics, and function calling capabilities over the previous version, enabling broader application to more complex real-world problems. The new version handles up to 32K tokens natively and up to 128K tokens using YaRN, allowing the model to maintain coherence when handling extensive documents or engaging in extended conversations. Furthermore, Kanana 1.5 delivers more natural and accurate conversations through a refined post-training process.

⚠️ Important Note

Neither the pre-training nor the post-training data includes Kakao user data.

Performance

Base Model Evaluation

Models	MMLU	KMMLU	HAERAE	HumanEval	MBPP	GSM8K
Kanana-1.5-8B	64.24	48.94	82.77	61.59	57.80	63.53
Kanana-8B	64.22	48.30	83.41	40.24	51.40	57.09
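
To make the size of the changes easier to read, the following minimal Python sketch computes the per-benchmark difference between the two rows of the base model table. The scores are copied from the table above; the dictionary layout and the helper name `score_deltas` are illustrative choices, not part of any Kanana tooling.

```python
# Base model scores copied from the table above.
BASE_SCORES = {
    "Kanana-1.5-8B": {
        "MMLU": 64.24, "KMMLU": 48.94, "HAERAE": 82.77,
        "HumanEval": 61.59, "MBPP": 57.80, "GSM8K": 63.53,
    },
    "Kanana-8B": {
        "MMLU": 64.22, "KMMLU": 48.30, "HAERAE": 83.41,
        "HumanEval": 40.24, "MBPP": 51.40, "GSM8K": 57.09,
    },
}


def score_deltas(new_scores, old_scores):
    """Return new minus old for every benchmark, rounded to two decimals."""
    return {name: round(new_scores[name] - old_scores[name], 2)
            for name in new_scores}


deltas = score_deltas(BASE_SCORES["Kanana-1.5-8B"], BASE_SCORES["Kanana-8B"])
for name, delta in deltas.items():
    # The explicit sign makes regressions (negative values) easy to spot.
    print(f"{name:<10}{delta:+.2f}")
```

On these numbers the largest gains are on the coding and math benchmarks (HumanEval, MBPP, GSM8K), MMLU and KMMLU are nearly unchanged, and HAERAE is slightly lower than in the previous version.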

Instruct Model Evaluation

Models	MT-Bench	KoMT-Bench	IFEval	HumanEval+	MBPP+	GSM8K (0-shot)	MATH	MMLU (0-shot, CoT)	KMMLU (0-shot, CoT)	FunctionChatBench
Kanana-1.5-8B*	7.76	7.63	80.11	76.83	67.99	87.64	67.54	68.82	48.28	58.00
Kanana-8B	7.13	6.92	76.91	62.20	43.92	79.23	37.68	66.50	47.43	17.37

⚠️ Important Note

* Models released under Apache 2.0 are trained on the latest versions compared to other models.

Processing 32K+ Length

Currently, the config.json uploaded to Hugging Face is configured for sequences of 32,768 tokens or fewer. To process longer sequences, YaRN must be applied. Adding the following parameters to config.json applies YaRN and allows the model to handle sequences of up to 128K tokens:

"rope_scaling": {
    "factor": 4.4,
    "original_max_position_embeddings": 32768,
    "type": "yarn",
    "beta_fast": 64,
    "beta_slow": 2
},
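
As a rough illustration of the edit described above, the following Python sketch adds the `rope_scaling` block to a locally downloaded config.json. It assumes the file has already been saved to disk; the helper names `apply_yarn` and `scaled_context` are hypothetical and are not part of the Kanana release or of any library.

```python
import json
from pathlib import Path

# Values copied from the rope_scaling fragment above.
YARN_ROPE_SCALING = {
    "factor": 4.4,
    "original_max_position_embeddings": 32768,
    "type": "yarn",
    "beta_fast": 64,
    "beta_slow": 2,
}


def apply_yarn(config_path):
    """Write the YaRN rope_scaling block into a local config.json.

    Returns the updated configuration as a dict. All other keys in the
    file are left untouched.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config["rope_scaling"] = dict(YARN_ROPE_SCALING)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


def scaled_context(rope_scaling):
    """Nominal scaled window: factor times the original position limit."""
    return int(rope_scaling["factor"]
               * rope_scaling["original_max_position_embeddings"])
```

Calling `apply_yarn("path/to/config.json")` on a local copy rewrites that file in place, so keeping a backup of the original is advisable. With the values above, `scaled_context(YARN_ROPE_SCALING)` gives 144,179 positions (4.4 × 32,768), which is above the 131,072 positions of a 128K window. How a given inference framework interprets these fields may vary, so check its documentation before relying on the extended length.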

Contributors

Language Model Training: Yunju Bak, Doohae Jung, Boseop Kim, Nayeon Kim, Hojin Lee, Jaesun Park, Minho Ryu
Language Model Alignment: Jiyeon Ham, Seungjae Jung, Hyunho Kim, Hyunwoong Ko, Changmin Lee, Daniel Wontae Nam
AI Engineering: Youmin Kim, Hyeongju Kim

Citation

@misc{kananallmteam2025kananacomputeefficientbilinguallanguage,
      title={Kanana: Compute-efficient Bilingual Language Models}, 
      author={Kanana LLM Team and Yunju Bak and Hojin Lee and Minho Ryu and Jiyeon Ham and Seungjae Jung and Daniel Wontae Nam and Taegyeong Eo and Donghun Lee and Doohae Jung and Boseop Kim and Nayeon Kim and Jaesun Park and Hyunho Kim and Hyunwoong Ko and Changmin Lee and Kyoung-Woon On and Seulye Baeg and Junrae Cho and Sunghee Jung and Jieun Kang and EungGyun Kim and Eunhwa Kim and Byeongil Ko and Daniel Lee and Minchul Lee and Miok Lee and Shinbok Lee and Gaeun Seo},
      year={2025},
      eprint={2502.18934},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.18934}, 
}

Contact

Kanana LLM Team Technical Support: kanana-llm@kakaocorp.com
Business & Partnership Contact: alpha.k@kakaocorp.com

📄 License

This project is licensed under the Apache 2.0 license.
