# 🚀 T-pro-it-2.0
T-pro-it-2.0 is a model built upon the Qwen 3 model family. It combines continual pre-training and alignment techniques to offer enhanced performance in various tasks, especially reasoning.
## 🚀 Quick Start
Before using the T-pro-it-2.0 model, please note the following important information:

🚨 Users are advised to exercise caution and are responsible for any additional training and oversight required to ensure the model's responses meet acceptable ethical and safety standards. The responsibility for incorporating this model into industrial or commercial solutions lies entirely with those who choose to deploy it.
## ✨ Features
T-pro-it-2.0 builds on the Qwen 3 model family and combines continual pre-training with the supervised fine-tuning and preference-tuning stages described below.
## 📚 Dataset
- Instruction Pre-Training: 40B tokens of instruction data, with one-third focused on reasoning tasks.
- Supervised Fine-Tuning (SFT): ~500K high-quality and diverse instructions with balanced complexity. Reasoning tasks make up about 20% of the dataset.
- Preference Tuning: ~100K carefully selected instructions, filtered by length and type for general tasks and with domain-balanced selection for reasoning tasks.
## 📊 Benchmarks

| Model | MERA | ruMMLU | Ru Arena Hard | ru AIME 2025 | ru LCB |
|---|---|---|---|---|---|
| T-pro 2.0 | 0.660 | 0.790 | 0.876 | 0.646 | 0.563 |
| Qwen 3 32B | 0.584 | 0.740 | 0.836 | 0.625 | 0.537 |
| Ruadapt 3 32B V2 | 0.574 | 0.737 | 0.660 | 0.450 | 0.500 |
| DeepSeek-R1-Distill-Qwen-32B | 0.508 | 0.702 | 0.426 | 0.402 | 0.493 |
| Gemma 3 27B | 0.577 | 0.695 | 0.759 | 0.231 | 0.261 |
## 🔧 Technical Details
### Switching Between Thinking and Non-Thinking Modes

To enable or disable reasoning mode in HuggingFace, set the `enable_thinking` flag in `tokenizer.apply_chat_template`. For more details, see the Qwen3 model card's guidance on switching between thinking and non-thinking modes.
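A minimal sketch of the toggle, assuming `tokenizer` and `messages` are defined as in the HF Usage example below:

```python
# Sketch: `tokenizer` and `messages` as in the HF Usage example below.
text_think = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True,
    enable_thinking=True,   # emit a <think>...</think> reasoning block before the answer
)
text_no_think = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True,
    enable_thinking=False,  # answer directly, without a reasoning trace
)
```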
### Recommended Generation Parameters

| Mode | Temperature | presence_penalty |
|---|---|---|
| No-think (general requests) | ≤ 0.3 | 1.0 |
| Think mode (standard requests) | ≈ 0.6 | 1.0 |
| Complex reasoning requests | ≥ 0.8 | 1.0 |
- Hybrid reasoning models need careful tuning of sampling hyperparameters, which vary by domain.
- Use a lower temperature for straightforward queries and a higher temperature for complex think-mode tasks (see the sketch after this list).
- A `presence_penalty` between 0 and 2 can help avoid repetitive outputs.
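The table can be restated directly in code. A hypothetical helper (the mode names and structure are illustrative, not part of the model API; the values are just the table above):

```python
# Hypothetical helper: map a request mode to the recommended sampling settings
# from the table above. Names are illustrative, not part of the model API.
RECOMMENDED_PARAMS = {
    "no_think": {"temperature": 0.3, "presence_penalty": 1.0},  # general requests
    "think":    {"temperature": 0.6, "presence_penalty": 1.0},  # standard requests
    "complex":  {"temperature": 0.8, "presence_penalty": 1.0},  # complex reasoning
}

def sampling_kwargs(mode: str) -> dict:
    """Return kwargs for a chat-completion or SamplingParams call."""
    return dict(RECOMMENDED_PARAMS[mode])
```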
## 💻 Usage Examples

### Basic Usage

#### SGLang Usage
For better quality and stable performance, we recommend SGLang as your inference framework.
To run an inference server for T-pro-it-2.0, start by launching the SGLang server:
```bash
python -m sglang.launch_server \
    --model-path t-tech/T-pro-it-2.0 \
    --reasoning-parser qwen3
```
Once the server is up and listening on `localhost:30000`, you can send chat-based requests via the OpenAI Python client:
```python
import openai

client = openai.OpenAI(
    base_url="http://127.0.0.1:30000/v1",
    api_key="ANY"  # the server ignores the API key
)

prompt = (
    "Please calculate the definite integral ∫_0^1 x² eˣ dx, "
    "explain the solution step by step and indicate the final result."
)

completion = client.chat.completions.create(
    model="ANY",  # the server ignores the model name
    messages=[
        {"role": "system", "content": "You are T-pro, a virtual assistant in T-Technology. Your task is to be a helpful dialogue assistant."},
        {"role": "user", "content": prompt}
    ],
    # REQUIRED: sampling params from the "Recommended Generation Parameters" table
    temperature=0.6,
    presence_penalty=1.0,
)

# The generated reply is in `completion.choices[0].message.content`
print(completion.choices[0].message.content)
```
Note: always include both `temperature` and `presence_penalty` in every completion call.
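If you want tokens as they are produced, the same endpoint can be called with streaming enabled. A minimal sketch using the standard OpenAI client streaming interface, with the same required sampling parameters:

```python
# Stream the reply chunk by chunk; sampling params stay as recommended above.
stream = client.chat.completions.create(
    model="ANY",
    messages=[{"role": "user", "content": prompt}],
    temperature=0.6,
    presence_penalty=1.0,
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```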
#### HF Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

torch.manual_seed(42)

model_name = "t-tech/T-pro-it-2.0"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)

prompt = (
    "Please calculate the definite integral ∫_0^1 x² eˣ dx, "
    "explain the solution step by step and indicate the final result."
)
messages = [
    {"role": "system", "content": "You are T-pro, a virtual assistant in T-Technology. Your task is to be a helpful dialogue assistant."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True  # Switches between thinking and non-thinking modes. Default is True.
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
Output:
<think>
Okay, I need to calculate the definite integral from 0 to 1 of the function x² * e^x dx. I remember that for integrals of this type, where there is a product of a polynomial and an exponential, the integration by parts method is usually applied. Let's recall the integration by parts formula: ∫u dv = uv - ∫v du.
First, we need to choose what to take as u and what as dv. Usually, in such cases, the polynomial (in this case x²) is taken as u because when differentiating it, its degree decreases, and eventually it will turn into a constant, which simplifies the integral. Then dv will be the remaining part, that is, e^x dx.
So, let u = x², then du = 2x dx. And dv = e^x dx, so v = ∫e^x dx = e^x.
Now, apply the integration by parts formula: ∫x² e^x dx = uv - ∫v du = x² e^x - ∫e^x * 2x dx.
Now we have an integral ∫2x e^x dx left. We also need to take it by parts. Again, choose u and dv. Now u = 2x, then du = 2 dx, and dv = e^x dx, so v = e^x.
Apply the formula again: ∫2x e^x dx = 2x e^x - ∫e^x * 2 dx = 2x e^x - 2∫e^x dx.
The integral ∫e^x dx is equal to e^x + C. Substitute it back: 2x e^x - 2e^x + C.
Now, return to the original integral: x² e^x - (2x e^x - 2e^x) + C = x² e^x - 2x e^x + 2e^x + C.
Now, we need to calculate the definite integral from 0 to 1. To do this, substitute the upper and lower limits into the antiderivative and subtract.
First, substitute x = 1: (1² * e^1) - 2*1*e^1 + 2*e^1 = e - 2e + 2e = (1 - 2 + 2)e = 1*e = e.
Now, substitute x = 0: (0² * e^0) - 2*0*e^0 + 2*e^0 = 0 - 0 + 2*1 = 2.
Subtract: e - 2.
Let me check if I made a mistake in the calculations. When substituting x = 1: 1*e - 2*1*e + 2*e = e - 2e + 2e = e. Yes, it's correct. When x = 0: 0 - 0 + 2*1 = 2. Then the difference is e - 2.
But wait, maybe I missed a sign somewhere? In the antiderivative after integration by parts: x² e^x - 2x e^x + 2e^x. Yes, it's correct. Then when substituting 1: 1*e - 2*1*e + 2*e = (1 - 2 + 2)e = 1*e. When 0: 0 - 0 + 2*1 = 2. Then the answer is e - 2.
But e is approximately equal to 2.718, so e - 2 ≈ 0.718. Intuitively, it seems that the integral from 0 to 1 of x² e^x should be positive, which is consistent.
Maybe I should check through another method or by differentiating the antiderivative? Let's take the derivative of x² e^x - 2x e^x + 2e^x.
The derivative: d/dx [x² e^x] = 2x e^x + x² e^x.
d/dx [-2x e^x] = -2 e^x - 2x e^x.
d/dx [2e^x] = 2 e^x.
Add them all up: 2x e^x + x² e^x - 2 e^x - 2x e^x + 2 e^x = x² e^x.
Yes, the derivative is equal to the integrand. So, the antiderivative is found correctly. Then the definite integral is equal to e - 2.
It seems everything is correct. Maybe we could have used a recurrence formula or an integral table, but the integration by parts method worked.
So, the final answer: e - 2.
</think>
To calculate the definite integral
$$
\int_0^1 x^2 e^x \, dx
$$
we use the **integration by parts method**.
### Step 1: Applying integration by parts
The integration by parts formula:
$$
\int u \, dv = uv - \int v \, du.
$$
We choose:
- $ u = x^2 $, then $ du = 2x \, dx $,
- $ dv = e^x \, dx $, then $ v = e^x $.
Substitute into the formula:
$$
\int x^2 e^x \, dx = x^2 e^x - \int e^x \cdot 2x \, dx = x^2 e^x - 2 \int x e^x \, dx.
$$
### Step 2: Integrating the remaining integral
Now we calculate $ \int x e^x \, dx $, again applying integration by parts:
- $ u = x $, then $ du = dx $,
- $ dv = e^x \, dx $, then $ v = e^x $.
We get:
$$
\int x e^x \, dx = x e^x - \int e^x \, dx = x e^x - e^x + C.
$$
Substitute it back:
$$
\int x^2 e^x \, dx = x^2 e^x - 2 \left( x e^x - e^x \right) + C = x^2 e^x - 2x e^x + 2 e^x + C.
$$
### Step 3: Calculating the definite integral
Substitute the limits $ 0 $ and $ 1 $:
$$
\left[ x^2 e^x - 2x e^x + 2 e^x \right]_0^1 = \left( 1^2 e^1 - 2 \cdot 1 \cdot e^1 + 2 e^1 \right) - \left( 0^2 e^0 - 2 \cdot 0 \cdot e^0 + 2 e^0 \right).
$$
Simplify:
- When $ x = 1 $:
$$
e - 2e + 2e = e.
$$
- When $ x = 0 $:
$$
0 - 0 + 2 \cdot 1 = 2.
$$
The final result:
$$
e - 2.
$$
### Answer:
$$
\boxed{e - 2}
$$
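Because the reply above embeds the reasoning trace, you may want to separate it from the final answer before post-processing. A minimal sketch, assuming the trace is delimited by `<think>...</think>` exactly as shown in the output above:

```python
# Split the reasoning trace from the final answer (delimiter as in the output above).
think, sep, answer = response.partition("</think>")
if sep:  # a reasoning trace was emitted
    reasoning = think.removeprefix("<think>").strip()
else:    # non-thinking mode: the whole response is the answer
    reasoning, answer = "", response
print(answer.strip())
```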
#### vLLM Usage
```python
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

model_name = "t-tech/T-pro-it-2.0"
tokenizer = AutoTokenizer.from_pretrained(model_name)
llm = LLM(model=model_name, max_model_len=8192)
sampling_params = SamplingParams(temperature=0.7,
                                 repetition_penalty=1.05,
                                 top_p=0.8, top_k=70,
                                 max_tokens=512)

prompt = (
    "Please calculate the definite integral ∫_0^1 x² eˣ dx, "
    "explain the solution step by step and indicate the final result."
)
messages = [
    {"role": "system", "content": "You are T-pro, a virtual assistant in T-Technology. Your task is to be a helpful dialogue assistant."},
    {"role": "user", "content": prompt}
]
prompt_token_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

outputs = llm.generate(prompt_token_ids=prompt_token_ids, sampling_params=sampling_params)

generated_text = [output.outputs[0].text for output in outputs]
print(generated_text)
```
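As in the HF example, the reasoning trace can be switched off when building the prompt. A sketch mirroring the call style above, using the no-think settings from the Recommended Generation Parameters table:

```python
# No-think variant: build the prompt without a reasoning trace and use the
# recommended low-temperature settings from the table above.
no_think_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, enable_thinking=False
)
no_think_params = SamplingParams(temperature=0.3, presence_penalty=1.0, max_tokens=512)
outputs = llm.generate(prompt_token_ids=no_think_ids, sampling_params=no_think_params)
print(outputs[0].outputs[0].text)
```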
### Advanced Usage

#### Long Context Usage
T-pro-it-2.0 natively supports a context length of 32,768 tokens.
For conversations where the input significantly exceeds this limit, follow the recommendations from the Qwen3 model card on processing long texts.
For example, with llama.cpp's `llama-server` you can enable 128K context support via YaRN RoPE scaling:

```bash
llama-server ... --rope-scaling yarn --rope-scale 4 --yarn-orig-ctx 32768
```
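If you are serving with SGLang instead, the Qwen3 model card suggests enabling YaRN via a model-override argument. A sketch under that assumption (verify the flags against your SGLang version):

```bash
python -m sglang.launch_server \
    --model-path t-tech/T-pro-it-2.0 \
    --reasoning-parser qwen3 \
    --json-model-override-args '{"rope_scaling":{"rope_type":"yarn","factor":4.0,"original_max_position_embeddings":32768}}' \
    --context-length 131072
```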
## 📄 License
This project is licensed under the Apache-2.0 license.

