# OpenCodeReasoning-Nemotron-32B-IOI
OpenCodeReasoning-Nemotron-32B-IOI is a large language model derived from Qwen2.5-32B-Instruct, post-trained for code generation reasoning and supporting a 32K token context length.
## 🚀 Quick Start
### Prerequisites
- Install the `transformers` library.
- Ensure you have a compatible NVIDIA GPU and relevant CUDA libraries installed.
### Running Inference

Here are examples of running inference on coding problems in different programming languages:

#### For C++ Programs
```python
import transformers
import torch

model_id = "nvidia/OpenCodeReasoning-Nemotron-32B-IOI"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

prompt = """You are a helpful and harmless assistant. You should think step-by-step before responding to the instruction below.

Please use c++ programming language only.

You must use ```cpp for just the final solution code block with the following format:
```cpp
// Your code here
```

{user}
"""

messages = [
    {
        "role": "user",
        "content": prompt.format(user="Write a program to calculate the sum of the first $N$ fibonacci numbers"),
    },
]

outputs = pipeline(
    messages,
    max_new_tokens=32768,
)
print(outputs[0]["generated_text"][-1]["content"])
```
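The prompt instructs the model to put its final answer in a fenced `cpp` code block, so a small post-processing step is usually needed to pull the solution out of the (often long) reasoning trace. Below is a minimal sketch, assuming the model follows the requested format; `extract_cpp` is an illustrative helper, not part of the model's API:

```python
import re

FENCE = "`" * 3  # build the fence programmatically so this example stays self-contained

# Match fenced cpp blocks; per the prompt format, the last one is the final solution.
CPP_BLOCK = re.compile(FENCE + r"cpp\s*\n(.*?)" + FENCE, re.DOTALL)

def extract_cpp(response: str):
    """Return the last fenced cpp block in `response`, or None if absent."""
    blocks = CPP_BLOCK.findall(response)
    return blocks[-1].strip() if blocks else None

solution = extract_cpp(outputs[0]["generated_text"][-1]["content"])
```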
#### For Python Programs
```python
import transformers
import torch

model_id = "nvidia/OpenCodeReasoning-Nemotron-32B"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

prompt = """You are a helpful and harmless assistant. You should think step-by-step before responding to the instruction below.

Please use python programming language only.

You must use ```python for just the final solution code block with the following format:
```python
# Your code here
```

{user}
"""

messages = [
    {
        "role": "user",
        "content": prompt.format(user="Write a program to calculate the sum of the first $N$ fibonacci numbers"),
    },
]

outputs = pipeline(
    messages,
    max_new_tokens=32768,
)
print(outputs[0]["generated_text"][-1]["content"])
```
## ✨ Features
- **Based on a Strong Foundation**: Derived from Qwen2.5-32B-Instruct, it inherits the powerful capabilities of the base model.
- **Code Generation Reasoning**: Specifically post-trained for code generation reasoning, enabling better performance in competitive coding scenarios.
- **Long Context Support**: Supports a context length of up to 32K tokens, allowing for more comprehensive input and output.
- **High Performance on NVIDIA Systems**: Designed and optimized to run on NVIDIA GPU-accelerated systems, achieving faster training and inference times.
## 📦 Installation
The model can be easily used through the `transformers` library. You can install it using the following command:
```bash
pip install transformers
```
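As a quick sanity check (a minimal sketch; it assumes the install succeeded and, for GPU inference, that CUDA is visible to PyTorch):

```python
import torch
import transformers

# Verify the library version and that a GPU is available for bfloat16 inference.
print(transformers.__version__)
print(torch.cuda.is_available())
```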
## 📚 Documentation
### Model Overview
OpenCodeReasoning-Nemotron-32B-IOI is a large language model for code generation reasoning. It is a derivative of Qwen2.5-32B-Instruct and has 32 billion parameters.
The following table shows the average results of 64 evaluations on each benchmark:
| Model | Dataset Size (Python) | Dataset Size (C++) | LiveCodeBench (pass@1) | CodeContests (pass@1) | IOI (Total Score) |
| --- | --- | --- | --- | --- | --- |
| OlympicCoder-7B | 0 | 100K | 40.9 | 10.6 | 127 |
| OlympicCoder-32B | 0 | 100K | 57.4 | 18.0 | 153.5 |
| QWQ-32B | - | - | 61.3 | 20.2 | 175.5 |
| **OpenCodeReasoning-IOI** | | | | | |
| OCR-Qwen-32B-Instruct | 736K | 356K | 61.5 | 25.5 | 175.5 |
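For context on how such numbers are computed: pass@1 here is the average over 64 generations per problem. A minimal sketch (not the official evaluation harness) using the standard unbiased pass@k estimator from the Codex paper, which for k=1 reduces to the fraction of correct samples:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: n samples drawn, c of them correct."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical per-problem correct counts out of 64 generations.
correct_counts = [40, 0, 64, 12]
pass_at_1 = sum(pass_at_k(64, c, 1) for c in correct_counts) / len(correct_counts)
print(f"benchmark pass@1 = {pass_at_1:.3f}")  # mean of c/64 across problems
```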
### Reproducing Results
### Model Architecture
- Architecture Type: Dense decoder-only Transformer model.
- Network Architecture: Qwen-32B-Instruct.
### Input
- Input Type(s): Text
- Input Format(s): String
- Input Parameters: One-Dimensional (1D)
- Other Properties Related to Input: Context length up to 32,768 tokens
### Output
- Output Type(s): Text
- Output Format: String
- Output Parameters: One-Dimensional (1D)
- Other Properties Related to Output: Context length up to 32,768 tokens (see the token-budget sketch below)
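Since the prompt and the generated output both count against the 32,768-token window of a decoder-only model, it can help to check prompt length before requesting very long generations. A minimal sketch (assuming the checkpoint ships a standard Hugging Face tokenizer):

```python
from transformers import AutoTokenizer

MAX_CONTEXT = 32768  # model context window in tokens
tokenizer = AutoTokenizer.from_pretrained("nvidia/OpenCodeReasoning-Nemotron-32B-IOI")

prompt = "Write a program to calculate the sum of the first $N$ fibonacci numbers"
prompt_tokens = len(tokenizer(prompt)["input_ids"])

# Whatever the prompt consumes is no longer available for the reasoning trace.
max_new_tokens = MAX_CONTEXT - prompt_tokens
print(f"{prompt_tokens} prompt tokens, up to {max_new_tokens} tokens left for generation")
```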
### Software Integration
- Runtime Engine: NeMo 2.3.0
- Recommended Hardware Microarchitecture Compatibility: NVIDIA Ampere, NVIDIA Hopper
- Preferred/Supported Operating System(s): Linux
### Model Version(s)

- 1.0 (4/25/2025):
  - OpenCodeReasoning-Nemotron-7B
  - OpenCodeReasoning-Nemotron-14B
  - OpenCodeReasoning-Nemotron-32B
  - OpenCodeReasoning-Nemotron-32B-IOI
### Training and Evaluation Datasets
#### Training Dataset
The training corpus is the OpenCodeReasoning dataset, which consists of competitive programming questions and DeepSeek-R1 generated responses.
- Data Collection Method: Hybrid: Automated, Human, Synthetic
- Labeling Method: Hybrid: Automated, Human, Synthetic
- Properties: 736k samples from OpenCodeReasoning
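To inspect the corpus, the dataset can be pulled from the Hugging Face Hub. A minimal sketch (the dataset id matches the OpenCodeReasoning release, but the `split_0` config and split names are assumptions that may need adjusting against the dataset card):

```python
from datasets import load_dataset

# Config/split names are assumptions; check the dataset card if they differ.
ds = load_dataset("nvidia/OpenCodeReasoning", "split_0", split="split_0")
print(len(ds), ds.column_names)
```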
#### Evaluation Dataset
The benchmarks reported in the table above (LiveCodeBench, CodeContests, and IOI) are used for evaluation.
- Data Collection Method: Hybrid: Automated, Human, Synthetic
- Labeling Method: Hybrid: Automated, Human, Synthetic
### License/Terms of Use

The use of this model is governed by the Apache 2.0 license.
### Deployment Geography
Global
### Use Case
This model is intended for developers and researchers building LLMs.
### Release Date

Hugging Face: 04/25/2025 via https://huggingface.co/nvidia/OpenCodeReasoning-Nemotron-32B/
### Inference
- Engine: vLLM
- Test Hardware: NVIDIA H100-80GB
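A minimal serving sketch with the vLLM engine named above (the sampling values are illustrative assumptions; in practice you would wrap the question in the prompt template from the Quick Start section):

```python
from vllm import LLM, SamplingParams

# Load the model with its full 32K context window.
llm = LLM(model="nvidia/OpenCodeReasoning-Nemotron-32B-IOI", max_model_len=32768)
params = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=30000)

prompts = ["Write a program to calculate the sum of the first $N$ fibonacci numbers"]
for out in llm.generate(prompts, params):
    print(out.outputs[0].text)
```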
## 🔧 Technical Details
### Model Architecture
The model is a dense decoder-only Transformer based on the Qwen-32B-Instruct architecture. Its 32 billion parameters provide a strong foundation for code generation reasoning.
### Input and Output
The model accepts text input as strings (one-dimensional sequences) with a context length of up to 32,768 tokens. The output is likewise text in string form, with the same context-length support.
### Training and Optimization
The model is post-trained on the OpenCodeReasoning dataset for code generation reasoning. It is designed and optimized to run on NVIDIA GPU-accelerated systems, leveraging NVIDIA's hardware and software frameworks to achieve faster training and inference times.
## 📄 License
This model is licensed under the Apache 2.0 license.
## Citation
If you find the data useful, please cite:
```bibtex
@article{ahmad2025opencodereasoning,
  title={OpenCodeReasoning: Advancing Data Distillation for Competitive Coding},
  author={Wasi Uddin Ahmad and Sean Narenthiran and Somshubra Majumdar and Aleksander Ficek and Siddhartha Jain and Jocelyn Huang and Vahid Noroozi and Boris Ginsburg},
  year={2025},
  eprint={2504.01943},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2504.01943},
}
```
## Additional Information
### Ethical Considerations
NVIDIA believes that Trustworthy AI is a shared responsibility. When using this model, developers should work with their internal model team to ensure it meets the requirements of the relevant industry and use case and addresses unforeseen product misuse. Please report security vulnerabilities or NVIDIA AI Concerns as appropriate.
## Model Performance Visualization

![Benchmark results for OpenCodeReasoning-Nemotron-32B-IOI](./results.png)