GreenMind-Medium-14B-R1: Open-Source Vietnamese Language Model for Intermediate Reasoning Problems

Developed by GreenNode

GreenMind-Medium-14B-R1 is a medium-scale Vietnamese language model that effectively solves problems requiring intermediate reasoning, such as questions on common sense, mathematics, natural science, and social science.

Large Language Model

Safetensors

Supports Multiple Languages

Open Source License: MIT

Tags: Vietnamese reasoning optimization, Multilingual math problem solving, Structured thinking generation

Downloads 50

Release Time: 4/25/2025

Model Overview

This model is fine-tuned from Qwen/Qwen2.5-14B-Instruct using the Group Relative Policy Optimization (GRPO) strategy so that it generates logically coherent responses.

Model Features

Intermediate reasoning capability

Effectively solves problems requiring intermediate reasoning, such as questions on common sense, mathematics, natural science, and social science.

Logically coherent responses

Fine-tuned with the Group Relative Policy Optimization (GRPO) strategy to generate logically coherent responses.

Multilingual support

Supports multiple languages including Vietnamese, English, Chinese, Indonesian, and Thai.

Model Capabilities

Text generation

Logical reasoning

Multilingual processing

Use Cases

Education

Math problem solving

Solving math problems, such as the chicken-and-rabbit cage problem.

Can correctly solve and demonstrate the reasoning process.
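As a reference for checking the model's answers on this kind of puzzle, the two-equation system behind it can be solved directly. The sketch below is illustrative only: the function name `solve_heads_and_legs` and its parameters are assumptions made for this example, and the numbers come from the riddle used in the Quick Start (36 animals, 100 legs).

```python
def solve_heads_and_legs(heads, legs, legs_a=2, legs_b=4):
    """Solve a + b = heads and legs_a*a + legs_b*b = legs for whole animals.

    Returns (a, b), or None when no non-negative integer solution exists.
    """
    # Substitute a = heads - b into the leg equation and solve for b.
    numerator = legs - legs_a * heads
    denominator = legs_b - legs_a
    if denominator == 0 or numerator % denominator != 0:
        return None
    b = numerator // denominator
    a = heads - b
    if a < 0 or b < 0:
        return None
    return a, b


print(solve_heads_and_legs(36, 100))  # (22, 14): 22 two-legged, 14 four-legged
```

Returning None for inconsistent inputs makes it easy to spot a malformed problem statement before comparing against a model answer.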

Natural science problem solving

Answering questions related to natural sciences.

Can provide logically coherent answers.

Social sciences

Social science problem solving

Answering questions related to social sciences.

Can provide logically coherent answers.

🚀 GreenMind-Medium-14B-R1

We're excited to release GreenMind-Medium-14B-R1, a medium-sized Vietnamese language model. It's highly effective at answering questions that demand intermediate-level reasoning, covering a wide range of topics like general knowledge, mathematics, natural science, and social science. By using the Group Relative Policy Optimization strategy for fine-tuning, we've guided the model to generate logically coherent responses.
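For readers unfamiliar with Group Relative Policy Optimization, the central idea is that several responses are sampled for the same prompt and each response's reward is normalized against its own group. The sketch below shows only that normalization step as GRPO is commonly described; it is not GreenNode's training code. The function name `group_relative_advantages`, the example rewards, the use of the population standard deviation, and the `eps` stabilizer are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and std of its own group.

    `rewards` holds one scalar reward per sampled response to the same prompt.
    `eps` (an assumption here) avoids division by zero when all rewards match.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled answers to one prompt.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print(advantages)
```

Responses that score above their group's average receive a positive advantage and those below receive a negative one, so no separate value model is needed to provide the baseline.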

🚀 Quick Start

The snippet below loads the model and tokenizer, builds the prompt with apply_chat_template, and generates a response.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "GreenNode/GreenMind-Medium-14B-R1"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

tokenizer = AutoTokenizer.from_pretrained(
    model_name,
    revision='main',
    trust_remote_code=False,
)
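# Vietnamese riddle: chickens and dogs, 36 animals in total with exactly 100 legs.
# How many chickens and how many dogs are there?
prompt = r"""Vừa gà vừa chó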
Bó lại cho tròn
Ba mươi sáu con
Một trăm chân chẵn
Hỏi có bao nhiêu con gà, bao nhiêu con chó?"""

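# The system and user turns ask (in Vietnamese) for step-by-step reasoning inside
# <think> tags and the final answer inside <answer> tags. The assistant turn
# pre-fills the start of the reply so generation continues inside <think>.
messages = [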
    {
        "role": "system",
        "content": "Bạn là một trợ lý ảo hữu ích trong việc trả lời câu hỏi. Hãy suy luận từng bước, và đưa ra đáp án trong thẻ <answer> </answer>."
    },
    {
        "role": "user",
        "content": f"{prompt} Hãy suy luận từng bước trong thẻ <think> </think>. Và trả về đáp án trong thẻ <answer> </answer>."
    },
    {
        "role": "assistant",
        "content": "Hãy để tôi giải quyết từng bước.\n<think>"
    }
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    continue_final_message=True,  # leave the pre-filled assistant turn open
)

model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=1024
)

# Drop the prompt tokens so that only the newly generated tokens are decoded.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
# Example output:
# Đầu tiên, chúng ta cần thiết lập hai phương trình dựa trên thông tin đề bài:
# 1. Tổng số con gà và chó là 36: x + y = 36
# 2. Tổng số chân là 100: 2x + 4y = 100
# Trong đó, x là số con gà và y là số con chó.
# Tiếp theo, chúng ta giải hệ phương trình này:
# Từ phương trình thứ nhất, ta có: x = 36 - y
# Thay vào phương trình thứ hai: 2(36 - y) + 4y = 100
# => 72 - 2y + 4y = 100
# => 2y = 28
# => y = 14 (số con chó)
# Thay y = 14 vào phương trình x + y = 36:
# => x = 36 - 14 = 22 (số con gà)
# Vậy, có 22 con gà và 14 con chó.
# </think>
# <answer>Có 22 con gà và 14 con chó.</answer>
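Because the response is structured with tags, downstream code usually wants the reasoning and the final answer as separate strings. The helper below is a minimal sketch of one way to do that, not part of the released code: the name `split_reasoning_and_answer` is hypothetical, and it assumes the output follows the tag layout shown in the example output above, where the opening <think> tag belongs to the prompt rather than the generated text.

```python
import re


def split_reasoning_and_answer(response):
    """Return (reasoning, answer) from a tagged response.

    The opening <think> tag may be absent because the Quick Start prompt
    already ends with it, so everything before </think> counts as reasoning.
    """
    answer_match = re.search(r"<answer>(.*?)</answer>", response, re.DOTALL)
    answer = answer_match.group(1).strip() if answer_match else None

    reasoning, sep, _ = response.partition("</think>")
    if not sep:
        reasoning = None  # no closing tag, e.g. generation was truncated
    else:
        reasoning = reasoning.replace("<think>", "").strip()
    return reasoning, answer


sample = "2y = 28\n=> y = 14\n</think>\n<answer>Có 22 con gà và 14 con chó.</answer>"
reasoning, answer = split_reasoning_and_answer(sample)
print(answer)
```

Returning None for a missing tag lets the caller detect truncated generations, for example when max_new_tokens is too small for the reasoning to finish.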

✨ Features

Model Description

Property	Details
Model Type	Causal language model
Base Model	Qwen/Qwen2.5-14B-Instruct
Parameters	14.7B
Context Length	Up to 131,072 tokens of full context; up to 8,192 generated tokens
Language	Vietnamese
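When choosing max_new_tokens for generation, the two limits in the table can be applied together. The sketch below is a hypothetical helper, not part of the model's API: the name `clamp_max_new_tokens` is made up for this example, and it assumes that the full context length covers the prompt plus the generated tokens.

```python
CONTEXT_LIMIT = 131_072      # full context length from the table above
GENERATION_LIMIT = 8_192     # generation length from the table above


def clamp_max_new_tokens(prompt_tokens, requested_new_tokens):
    """Return a max_new_tokens value that respects both documented limits.

    Raises ValueError if the prompt alone leaves no room in the context window.
    """
    remaining = CONTEXT_LIMIT - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt leaves no room in the context window")
    return min(requested_new_tokens, GENERATION_LIMIT, remaining)


print(clamp_max_new_tokens(prompt_tokens=2_000, requested_new_tokens=1_024))    # 1024
print(clamp_max_new_tokens(prompt_tokens=130_000, requested_new_tokens=8_192))  # 1072
```

The prompt token count can be read from the tokenized input, for example the length of the input_ids produced in the Quick Start.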

📚 Documentation

Evaluation

Table 1. SeaExam Dataset. GreenMind-Medium-14B-R1 compared to the base model and some larger models.

Model	SeaExam-ID	SeaExam-TH	SeaExam-VI	Avg
Meta-Llama-3.1-70B-Instruct	65.8	70.6	72.6	69.7
gemma3-27b-it	64.4	67.5	73.1	68.4
Qwen2.5-14B-Instruct	67.6	68.8	73.1	69.8
GreenMind-Medium-14B-R1	74.36	69.75	74.44	72.79
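To make the comparison with the base model explicit, the per-column gains can be computed from the two relevant rows. This is a small illustrative calculation using only the scores printed in Table 1; the variable names are chosen for this example.

```python
# Scores copied from Table 1 (SeaExam-ID, SeaExam-TH, SeaExam-VI, Avg).
COLUMNS = ["SeaExam-ID", "SeaExam-TH", "SeaExam-VI", "Avg"]
BASE = [67.6, 68.8, 73.1, 69.8]          # Qwen2.5-14B-Instruct
TUNED = [74.36, 69.75, 74.44, 72.79]     # GreenMind-Medium-14B-R1

# Difference in points between the fine-tuned model and its base model.
gains = {
    name: round(tuned - base, 2)
    for name, base, tuned in zip(COLUMNS, BASE, TUNED)
}

for name, gain in gains.items():
    print(f"{name}: {gain:+.2f} points")
```

On these figures the largest gain over the base model is on SeaExam-ID (+6.76 points), with smaller gains on SeaExam-TH (+0.95) and SeaExam-VI (+1.34).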

Table 2. VLSP 2023 Challenge. Our model outperforms most SOTA models and leads every metric among the models listed here (↑ higher is better, ↓ lower is better).

Model	ComprehensionQA-vi ↑	Exams-vi ↑	LAMBADA-vi ↓	WikiQA-vi ↑	MMLU-vi ↑
cpt-smartbot-13b	0.6633	0.3473	21.9864	0.4455	0.414
ura-llama-13b	0.6556	0.342	17.5614	0.438	0.3973
greennode-7b (prior work)	0.6122	0.2892	189.7782	0.3335	0.387
greennode-14b (prior work)	0.6711	0.3672	29.5967	0.468	0.5281
GreenMind-Medium-14B-R1 (Ours)	0.8689	0.7796	10.7609	0.7915	0.7124

Table 3. VMLU Dataset. Performance compared with other fine-tuned models.

Model	Access	STEM	Social Science	Humanities	Others	Avg
VNPTAI.IO-Medium-R1	Private	77.09	82.3	78.85	69.98	77.43
MISA-Llama3-v1.1	Private	77.5	80.75	76.62	71.6	76.87
BnK-AI-Medium-v2	Private	80.94	80.76	70.7	74.06	76.66
VNPTAI.IO-Large-v4	Private	78.05	79.05	75.39	70.37	76.21
GreenNode-xMedium-v1	Private	75.7	81.09	75.25	69.33	75.5
GreenMind-Medium-14B-R1 (Ours)	Weight	76.78	77.36	72.32	69.03	74.29
CakebyVPBank-Large	Private	77.75	78.11	70.38	67.82	73.99
DeepSeek-R1-Distill-Llama-70B	Weight	76.77	76.23	67.98	66.82	72.41

📄 License

This repository and the model weights are licensed under the MIT License.

📖 Citation

If you find our work helpful, feel free to cite us.

@misc{tung2025greenmindnextgenerationvietnameselarge,
      title={GreenMind: A Next-Generation Vietnamese Large Language Model for Structured and Logical Reasoning}, 
      author={Luu Quy Tung and Hoang Quoc Viet and Vo Trong Thu},
      year={2025},
      eprint={2504.16832},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2504.16832}, 
}

📞 Contact Us

General & Collaboration: tung.vu@greennode.ai, thuvt@greennode.ai
Technical: viethq5@greennode.ai

🔗 Follow Us

https://x.com/greennode23

💬 Support

https://discord.gg/B6MJFM3J3a
