A.X 4.0 Light Open-Source Large Language Model - Free Deployment, Optimized for Korean Comprehension and Enterprise Applications

A.X 4.0 Light GGUF

Developed by mykor

A.X 4.0 Light is a lightweight large language model developed by SKT AI Model Lab, built on Qwen2.5 and optimized for Korean understanding and enterprise deployment.

Large Language Model

Transformers

Supports Multiple Languages

Open Source License: Apache-2.0

#Korean optimization #Enterprise-level deployment #Long context processing

Downloads 535

Release Time: 7/4/2025

Model Overview

A.X 4.0 Light is a lightweight large language model focused on Korean understanding and efficient deployment. It supports long-context processing and is benchmarked on knowledge, instruction-following, math, and code tasks.

Model Features

Excellent Korean proficiency

The standard A.X 4.0 model scored 78.3 on the Korean evaluation benchmark KMMLU, exceeding GPT-4o (72.5); A.X 4.0 Light scored 64.15.

In-depth cultural understanding

The standard A.X 4.0 model scored 83.5 on the Korean culture and context understanding benchmark CLIcK, surpassing GPT-4o (80.2); A.X 4.0 Light scored 68.05.

Efficient token usage

For the same Korean input, it uses approximately 33% fewer tokens than GPT-4o.

Flexible deployment methods

Offers a standard model with 72B parameters and a 7B lightweight version.

Long context processing ability

Supports up to 131,072 tokens (the lightweight model supports 16,384 tokens).
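The context limits above suggest a simple budget check before generation: the prompt tokens plus the requested new tokens should fit in the window. The sketch below is only an illustration. `fits_context` and its parameter names are hypothetical, and it assumes the limit covers prompt and generated tokens together, which is the usual convention but is not stated on this page.

```python
# Context-window limits quoted in the feature list above.
STANDARD_CONTEXT = 131_072  # A.X 4.0 (72B)
LIGHT_CONTEXT = 16_384      # A.X 4.0 Light (7B)


def fits_context(prompt_tokens: int, max_new_tokens: int,
                 context_limit: int = LIGHT_CONTEXT) -> bool:
    """Return True if prompt plus generation fits in the window.

    Assumption: the limit applies to prompt and generated tokens combined.
    """
    if prompt_tokens < 0 or max_new_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_new_tokens <= context_limit


print(fits_context(16_000, 128))                    # True
print(fits_context(16_300, 128))                    # False
print(fits_context(16_300, 128, STANDARD_CONTEXT))  # True
```

A request that fails the check for the lightweight model would need a shorter prompt, a smaller `max_new_tokens`, or the standard model.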

Model Capabilities

Korean text generation

English text generation

Korean translation

Tool invocation

Long context understanding

Use Cases

Translation

English to Korean translation

Translate English sentences into Korean.

On April 12, 1961, the first human went into space and orbited the Earth.

Question answering

Suggested air-conditioning temperature in summer

Provide suggestions on the appropriate air-conditioning temperature in summer.

The appropriate air-conditioning temperature in summer is generally 24-26 degrees.

Tool invocation

Discount calculation

Calculate the discounted price based on the original price and the discount rate.

When a 15% discount is applied to a product priced at 57,600 won, the discounted price is 48,960 won.

🚀 A.X 4.0 Light

A.X 4.0 Light is a lightweight large language model optimized for Korean-language understanding and enterprise deployment. It offers high-performance language processing and suits a range of business scenarios.

✨ Features

Superior Korean Proficiency: A.X 4.0 achieved a score of 78.3 on [KMMLU](https://huggingface.co/datasets/HAERAE-HUB/KMMLU), outperforming GPT-4o (72.5).
Deep Cultural Understanding: A.X 4.0 scored 83.5 on CLIcK, surpassing GPT-4o (80.2).
Efficient Token Usage: Uses approximately 33% fewer tokens than GPT-4o for the same Korean input.
Deployment Flexibility: Available in both a 72B-parameter standard model (A.X 4.0) and a 7B lightweight version (A.X 4.0 Light).
Long Context Handling: Supports up to 131,072 tokens (the lightweight model supports up to 16,384 tokens).
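To make the token-efficiency claim above concrete, the sketch below estimates how many tokens a Korean input might take given a GPT-4o token count. `estimate_tokens` is a hypothetical helper, and the 0.33 default is just the approximate figure quoted above; actual savings depend on the text.

```python
def estimate_tokens(baseline_tokens: int, reduction: float = 0.33) -> int:
    """Estimate a token count from a baseline count and a fractional reduction.

    Assumption: the reduction applies uniformly, which real text will not
    follow exactly.
    """
    if baseline_tokens < 0:
        raise ValueError("baseline_tokens must be non-negative")
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return round(baseline_tokens * (1.0 - reduction))


# A Korean input that takes 1,000 tokens with the baseline tokenizer.
print(estimate_tokens(1_000))  # 670
```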

📊 Performance

Model Performance

Benchmarks	A.X 4.0	Qwen3-235B-A22B (w/o reasoning)	Qwen2.5-72B	GPT-4o
Knowledge - KMMLU	78.32	73.64	66.44	72.51
Knowledge - CLIcK	83.51	74.55	72.59	80.22
Knowledge - KoBALT	47.30	41.57	37.00	44.00
Knowledge - MMLU	86.62	87.37	85.70	88.70
General - Ko-MT-Bench	86.69	88.00	82.69	88.44
General - MT-Bench	83.25	86.56	93.50	88.19
General - LiveBench (2024.11)	52.30	64.50	54.20	52.19
Instruction Following - Ko-IFEval	77.96	77.53	77.07	75.38
Instruction Following - IFEval	86.05	85.77	86.54	83.86
Math - HRM8K	48.55	54.52	46.37	43.27
Math - MATH	74.28	72.72	77.00	72.38
Code - HumanEval+	79.27	79.27	81.71	86.00
Code - MBPP+	73.28	70.11	75.66	75.10
Code - LiveCodeBench (2024.10~2025.04)	26.07	33.09	27.58	29.30
Long Context - LongBench (<128K)	56.70	49.40	45.60	47.50
Tool-use - FunctionChatBench	85.96	82.43	88.30	95.70

Lightweight Model Performance

Benchmarks	A.X 4.0 Light	Qwen3-8B (w/o reasoning)	Qwen2.5-7B	EXAONE-3.5-7.8B	Kanana-1.5-8B
Knowledge - KMMLU	64.15	63.53	49.56	53.76	48.28
Knowledge - CLIcK	68.05	62.71	60.56	64.30	61.30
Knowledge - KoBALT	30.29	26.57	21.57	21.71	23.14
Knowledge - MMLU	75.43	82.89	75.40	72.20	68.82
General - Ko-MT-Bench	79.50	64.06	61.31	81.06	76.30
General - MT-Bench	81.56	65.69	79.37	83.50	77.60
General - LiveBench	37.10	50.20	37.00	40.20	29.40
Instruction Following - Ko - IFEval	72.99	73.39	60.73	65.01	69.96
Instruction Following - IFEval	84.68	85.38	76.73	82.61	80.11
Math - HRM8K	40.12	52.50	35.13	31.88	30.87
Math - MATH	68.88	71.48	65.58	63.20	59.28
Code - HumanEval+	75.61	77.44	74.39	76.83	76.83
Code - MBPP+	67.20	62.17	68.50	64.29	67.99
Code - LiveCodeBench	18.03	23.93	16.62	17.98	16.52
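One way to read the lightweight table is to compute A.X 4.0 Light's margin over the best competing model on each benchmark. The sketch below does this for the three Korean knowledge rows only, with values copied from the table above; `SCORES` and `lead_over_best_rival` are illustrative names.

```python
# Korean-knowledge rows copied from the lightweight table above.
# Column order: A.X 4.0 Light, Qwen3-8B, Qwen2.5-7B, EXAONE-3.5-7.8B, Kanana-1.5-8B
SCORES = {
    "KMMLU": [64.15, 63.53, 49.56, 53.76, 48.28],
    "CLIcK": [68.05, 62.71, 60.56, 64.30, 61.30],
    "KoBALT": [30.29, 26.57, 21.57, 21.71, 23.14],
}


def lead_over_best_rival(row: list[float]) -> float:
    """Difference between the first column and the best of the rest."""
    ours, rivals = row[0], row[1:]
    return round(ours - max(rivals), 2)


margins = {name: lead_over_best_rival(row) for name, row in SCORES.items()}
for name, margin in margins.items():
    print(f"{name}: {margin:+.2f}")
# KMMLU: +0.62
# CLIcK: +3.75
# KoBALT: +3.72
```

A negative margin would mean another model in the table scores higher on that benchmark, as happens on several of the math and general rows.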

🚀 Quick Start

with HuggingFace Transformers

transformers>=4.46.0 or the latest version is required to use skt/A.X-4.0-Light

pip install "transformers>=4.46.0"

Example Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "skt/A.X-4.0-Light"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model.eval()
tokenizer = AutoTokenizer.from_pretrained(model_name)

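messages = [
    # System prompt in English: "You are an AI expert who translates the English sentences provided by the user into Korean."
    {"role": "system", "content": "당신은 사용자가 제공하는 영어 문장들을 한국어로 번역하는 AI 전문가입니다."},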
    {"role": "user", "content": "The first human went into space and orbited the Earth on April 12, 1961."},
]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(
        input_ids,
        max_new_tokens=128,
        do_sample=False,
    )

len_input_prompt = len(input_ids[0])
response = tokenizer.decode(output[0][len_input_prompt:], skip_special_tokens=True)
print(response)
# Output:
# 1961년 4월 12일, 최초의 인간이 우주로 나가 지구를 공전했습니다.
# (In English: "On April 12, 1961, the first human went into space and orbited the Earth.")
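When translating many sentences, the message list from the example above can be built by a small helper. This is a sketch only: `build_translation_messages` is a hypothetical name, and the system prompt is reused verbatim from the example.

```python
# System prompt reused verbatim from the example above.
TRANSLATION_SYSTEM_PROMPT = (
    "당신은 사용자가 제공하는 영어 문장들을 한국어로 번역하는 AI 전문가입니다."
)


def build_translation_messages(sentence: str) -> list[dict[str, str]]:
    """Build the chat messages for one English-to-Korean translation request."""
    sentence = sentence.strip()
    if not sentence:
        raise ValueError("sentence must not be empty")
    return [
        {"role": "system", "content": TRANSLATION_SYSTEM_PROMPT},
        {"role": "user", "content": sentence},
    ]


messages = build_translation_messages(
    "The first human went into space and orbited the Earth on April 12, 1961."
)
print([m["role"] for m in messages])  # ['system', 'user']
```

The returned list can be passed to `tokenizer.apply_chat_template` exactly as `messages` is in the example above.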

with vLLM

vllm>=0.6.4.post1 or the latest version is required to use the tool-use function

pip install "vllm>=0.6.4.post1"
# To disable the tool-use function, comment out the VLLM_OPTION line below.
VLLM_OPTION="--enable-auto-tool-choice --tool-call-parser hermes"
vllm serve skt/A.X-4.0-Light $VLLM_OPTION

Example Usage

from openai import OpenAI

def call(messages, model):
    completion = client.chat.completions.create(
        model=model,
        messages=messages,
    )
    print(completion.choices[0].message)

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="api_key"
)
model = "skt/A.X-4.0-Light"
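# Korean prompt; in English: "What is the appropriate air-conditioner temperature in summer? Answer in one line."
messages = [{"role": "user", "content": "에어컨 여름철 적정 온도는? 한줄로 답변해줘"}]
call(messages, model)
# Output (in English: "The appropriate air-conditioner temperature in summer is generally 24-26 degrees."):
# ChatCompletionMessage(content='여름철 적정 에어컨 온도는 일반적으로 24-26도입니다.', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=[], reasoning_content=None)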

messages = [{"role": "user", "content": "What is the appropriate temperature for air conditioning in summer? Response in a single sentence."}]
call(messages, model)
# Output:
# ChatCompletionMessage(content='The appropriate temperature for air conditioning in summer generally ranges from 72°F to 78°F (22°C to 26°C) for comfort and energy efficiency.', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=[], reasoning_content=None)

Examples for tool-use

from openai import OpenAI


def call(messages, model):
    completion = client.chat.completions.create(
        model=model,
        messages=messages,
        tools=tools
    )
    print(completion.choices[0].message)


client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="api_key"
)
model = "skt/A.X-4.0-Light"

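calculate_discount = {
    "type": "function",
    "function": {
        "name": "calculate_discount",
        # In English: "Takes the original price and the discount rate (in percent) and calculates the discounted price."
        "description": "원가격과 할인율(퍼센트 단위)을 입력받아 할인된 가격을 계산한다.",
        "parameters": {
            "type": "object",
            "properties": {
                "original_price": {
                    "type": "number",
                    # In English: "Original price of the product"
                    "description": "상품의 원래 가격"
                },
                "discount_percentage": {
                    "type": "number",
                    # In English: "Discount rate to apply (e.g., enter 20 for a 20% discount)"
                    "description": "적용할 할인율(예: 20% 할인의 경우 20을 입력)"
                }
            },
            "required": ["original_price", "discount_percentage"]
        }
    }
}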
get_exchange_rate = {
    "type": "function",
    "function": {
        "name": "get_exchange_rate",
        # In English: "Gets the exchange rate between two currencies."
        "description": "두 통화 간의 환율을 가져온다.",
        "parameters": {
            "type": "object",
            "properties": {
                "base_currency": {
                    "type": "string",
                    "description": "The currency to convert from."
                },
                "target_currency": {
                    "type": "string",
                    "description": "The currency to convert to."
                }
            },
            "required": ["base_currency", "target_currency"]
        }
    }
}
tools = [calculate_discount, get_exchange_rate]

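### Slot filling ###
# Korean prompt; in English: "We need to buy something that is originally 57,600 won, and we can get an employee discount. Can you calculate the discounted price?"
messages = [{"role": "user", "content": "우리가 뭘 사야되는데 원래 57600원인데 직원할인 받을 수 있거든? 할인가좀 계산해줘"}]
call(messages, model)
# Output (in English: "Could you tell me the discount rate?"):
# ChatCompletionMessage(content='할인율을 알려주시겠습니까?', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=[], reasoning_content=None)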


### Function calling ###
# The final user turn supplies the missing slot; in English: "We can get a 15% discount."
messages = [
    {"role": "user", "content": "우리가 뭘 사야되는데 원래 57600원인데 직원할인 받을 수 있거든? 할인가좀 계산해줘"},
    {"role": "assistant", "content": "할인율을 알려주시겠습니까?"},
    {"role": "user", "content": "15% 할인 받을 수 있어."},
]
call(messages, model)
# Output: 
# ChatCompletionMessage(content=None, refusal=None, role='assistant', audio=None, function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='chatcmpl-tool-7778d1d9fca94bf2acbb44c79359502c', function=Function(arguments='{"original_price": 57600, "discount_percentage": 15}', name='calculate_discount'), type='function')], reasoning_content=None)


### Completion ###
messages = [
    {"role": "user", "content": "우리가 뭘 사야되는데 원래 57600원인데 직원할인 받을 수 있거든? 할인가좀 계산해줘"},
    {"role": "assistant", "content": "할인율을 알려주시겠습니까?"},
    {"role": "user", "content": "15% 할인 받을 수 있어."},
    {"role": "tool", "tool_call_id": "random_id", "name": "calculate_discount", "content": "{\"original_price\": 57600, \"discount_percentage\": 15, \"discounted_price\": 48960.0}"}
]
call(messages, model)
# Output: 
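# ChatCompletionMessage(content='57600원의 상품에서 15% 할인을 적용하면, 할인된 가격은 48960원입니다.', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=[], reasoning_content=None)
# (In English: "Applying a 15% discount to the 57,600-won product gives a discounted price of 48,960 won.")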
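In the tool-use example above, the `tool` message content is written by hand. In practice the application would run the requested function itself and feed the result back. The following is a minimal sketch of that step, assuming a local Python implementation of `calculate_discount`; `TOOL_REGISTRY` and `run_tool_call` are hypothetical names and not part of the model card or of any library.

```python
import json


def calculate_discount(original_price: float, discount_percentage: float) -> dict:
    """Local implementation matching the calculate_discount tool schema."""
    discounted = original_price * (100 - discount_percentage) / 100
    return {
        "original_price": original_price,
        "discount_percentage": discount_percentage,
        "discounted_price": float(discounted),
    }


# Hypothetical registry mapping tool names to local Python callables.
TOOL_REGISTRY = {"calculate_discount": calculate_discount}


def run_tool_call(name: str, arguments_json: str, tool_call_id: str) -> dict:
    """Execute one tool call and wrap the result as a `tool` message."""
    if name not in TOOL_REGISTRY:
        raise KeyError(f"unknown tool: {name}")
    kwargs = json.loads(arguments_json)
    result = TOOL_REGISTRY[name](**kwargs)
    return {
        "role": "tool",
        "tool_call_id": tool_call_id,
        "name": name,
        "content": json.dumps(result),
    }


# Arguments string copied from the "Function calling" output above.
tool_message = run_tool_call(
    "calculate_discount",
    '{"original_price": 57600, "discount_percentage": 15}',
    "random_id",
)
print(tool_message["content"])
# {"original_price": 57600, "discount_percentage": 15, "discounted_price": 48960.0}
```

The resulting dictionary has the same shape as the `tool` message in the "Completion" step, so it can be appended to `messages` before the final call. In a real loop, `name`, `arguments_json`, and `tool_call_id` would come from the entries in `tool_calls` on the model's response.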

📄 License

The A.X 4.0 Light models are licensed under Apache License 2.0.

📖 Citation

@article{SKTAdotX4Light,
  title={A.X 4.0 Light},
  author={SKT AI Model Lab},
  year={2025},
  url={https://huggingface.co/skt/A.X-4.0-Light}
}

📞 Contact

Business & Partnership Contact: a.x@sk.com
