# ELYZA-Shortcut-1.0-Qwen-32B
ELYZA-Shortcut-1.0-Qwen-32B is a non-reasoning model derived during the development of the reasoning model ELYZA-Thinking-1.0-Qwen-32B. Based on Qwen/Qwen2.5-32B-Instruct, it was post-trained to bypass the step-by-step reasoning process and generate final answers directly.

## 🚀 Quick Start
This model can be used with the Hugging Face Transformers library. You can follow the steps below to start using it.
## ✨ Features
- Derived Model: A non-reasoning model derived during the development of ELYZA-Thinking-1.0-Qwen-32B.
- Direct Answer Generation: Based on Qwen/Qwen2.5-32B-Instruct, it generates final answers directly, without step-by-step reasoning, after post-training.
- Post-training Method: Post-training uses supervised fine-tuning (SFT) on problem-solution pairs obtained by removing the reasoning steps from optimal reasoning paths explored with an MCTS-based algorithm.
## 📦 Installation
To use this model, you need to install the necessary libraries. You can install the transformers library (and torch) with the following command:

```bash
pip install transformers torch
```
For deployment, you can install vLLM to create an OpenAI-compatible server:

```bash
pip install vllm
```
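After installation, a quick sanity check like the one below (a minimal sketch, not specific to this model) confirms that the libraries import correctly and whether a GPU is visible, which matters before loading a 32B model:

```python
import torch
import transformers

# Verify the installed versions and GPU availability.
print("transformers:", transformers.__version__)
print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
```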
## 💻 Usage Examples
### Basic Usage
The following code shows how to use the model for inference:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "elyza/ELYZA-Shortcut-1.0-Qwen-32B"
# "Please list five ideas for regaining enthusiasm for your work."
prompt = "仕事の熱意を取り戻すためのアイデアを5つ挙げてください。"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)
model.eval()

# Format the user message with the model's chat template.
messages = [{"role": "user", "content": prompt}]
input_text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
token_ids = tokenizer.encode(
    input_text, add_special_tokens=False, return_tensors="pt"
)

with torch.no_grad():
    output_ids = model.generate(
        token_ids.to(model.device),
        max_new_tokens=8192,
        do_sample=True,
        temperature=0.6,
        top_p=0.95,
    )

# Decode only the newly generated tokens, skipping the prompt.
output = tokenizer.decode(
    output_ids.tolist()[0][token_ids.size(1):], skip_special_tokens=True
)
print(output)
```
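If you would rather see tokens printed as they are generated instead of waiting for the full completion, transformers provides TextStreamer. A minimal sketch, reusing the model, tokenizer, and token_ids from the example above:

```python
from transformers import TextStreamer

# Streams decoded text to stdout as tokens are generated,
# omitting the prompt and special tokens.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

with torch.no_grad():
    model.generate(
        token_ids.to(model.device),
        max_new_tokens=8192,
        do_sample=True,
        temperature=0.6,
        top_p=0.95,
        streamer=streamer,
    )
```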
### Advanced Usage
For deployment, vLLM is recommended for creating an OpenAI-compatible server. The following command shows how to serve the model using vLLM:
```bash
vllm serve elyza/ELYZA-Shortcut-1.0-Qwen-32B \
    --tensor-parallel-size 8 \
    --max-model-len 32768
```
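Once the server is up, it exposes an OpenAI-compatible API (by default at http://localhost:8000/v1), so you can query it with the openai Python client. A minimal sketch, assuming the default host and port:

```python
from openai import OpenAI

# vLLM's OpenAI-compatible server does not check API keys;
# "EMPTY" is the conventional placeholder.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="elyza/ELYZA-Shortcut-1.0-Qwen-32B",
    messages=[
        {"role": "user", "content": "Please list five ideas for regaining enthusiasm for your work."}
    ],
    temperature=0.6,
    top_p=0.95,
)
print(response.choices[0].message.content)
```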
## 📚 Documentation
### Model Description
ELYZA-Shortcut-1.0-Qwen-32B is a non-reasoning model derived during the development of the reasoning model ELYZA-Thinking-1.0-Qwen-32B. Based on Qwen/Qwen2.5-32B-Instruct, the model has been post-trained to bypass the step-by-step reasoning process and directly generate final answers (Built with Qwen).
During the post-training phase, the model was trained via supervised fine-tuning (SFT) on problem-solution pairs. These pairs were obtained by removing the reasoning steps from optimal reasoning paths explored with a Monte Carlo Tree Search (MCTS)-based algorithm. For more details, please refer to our blog post.
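The card does not specify the data format used for this step. As a purely hypothetical illustration, the sketch below assumes reasoning traces delimited by `<think>...</think>` tags (a common convention for reasoning traces) and strips them, leaving problem-solution pairs for SFT:

```python
import re

# Hypothetical delimiter for reasoning traces; the actual format is not documented here.
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)

def to_sft_pair(problem: str, traced_solution: str) -> dict:
    """Drop the reasoning trace, keeping only the final answer text."""
    answer = THINK_BLOCK.sub("", traced_solution).strip()
    return {"messages": [
        {"role": "user", "content": problem},
        {"role": "assistant", "content": answer},
    ]}

# Hypothetical example: a traced solution found by a search procedure.
pair = to_sft_pair(
    "What is 17 * 24?",
    "<think>17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408.</think>The answer is 408.",
)
print(pair)
```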
### How to Cite
If you use this model in your work, please cite it as follows:
```bibtex
@misc{elyza2025thinking,
      title={elyza/ELYZA-Thinking-1.0-Qwen-32B},
      url={https://huggingface.co/elyza/ELYZA-Thinking-1.0-Qwen-32B},
      author={Masato Hirakawa and Tomoaki Nakamura and Akira Sasaki and Daisuke Oba and Shoetsu Sato},
      year={2025},
}
```
### Citations
```bibtex
@misc{qwen2.5,
      title={Qwen2.5: A Party of Foundation Models},
      url={https://qwenlm.github.io/blog/qwen2.5/},
      author={Qwen Team},
      month={September},
      year={2024}
}

@article{qwen2,
      title={Qwen2 Technical Report},
      author={An Yang and Baosong Yang and Binyuan Hui and Bo Zheng and Bowen Yu and Chang Zhou and Chengpeng Li and Chengyuan Li and Dayiheng Liu and Fei Huang and Guanting Dong and Haoran Wei and Huan Lin and Jialong Tang and Jialin Wang and Jian Yang and Jianhong Tu and Jianwei Zhang and Jianxin Ma and Jin Xu and Jingren Zhou and Jinze Bai and Jinzheng He and Junyang Lin and Kai Dang and Keming Lu and Keqin Chen and Kexin Yang and Mei Li and Mingfeng Xue and Na Ni and Pei Zhang and Peng Wang and Ru Peng and Rui Men and Ruize Gao and Runji Lin and Shijie Wang and Shuai Bai and Sinan Tan and Tianhang Zhu and Tianhao Li and Tianyu Liu and Wenbin Ge and Xiaodong Deng and Xiaohuan Zhou and Xingzhang Ren and Xinyu Zhang and Xipin Wei and Xuancheng Ren and Yang Fan and Yang Yao and Yichang Zhang and Yu Wan and Yunfei Chu and Yuqiong Liu and Zeyu Cui and Zhenru Zhang and Zhihao Fan},
      journal={arXiv preprint arXiv:2407.10671},
      year={2024}
}
```
## 📄 License
This project is licensed under the Apache-2.0 license.