🚀 OpenR1-Qwen-7B-Turkish 🚀
This is a fine-tuned version of Qwen2.5-Instruct trained on WiroAI/dolphin-r1-turkish. It aims to address limitations in language reasoning and performance on low-resource languages, as a contribution to the open-source community.
🚀 Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "WiroAI/OpenR1-Qwen-7B-Turkish"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# "Find the value of x that satisfies the equation 4x+5 = 6x+7."
prompt = "$4x+5 = 6x+7$ denklemini sağlayan $x$ değerini bul."
messages = [
    # "Please think step by step and answer."
    {"role": "system", "content": "Lütfen adım adım düşün ve cevapla."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=4096
)
# Strip the prompt tokens so only the newly generated tokens are decoded.
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
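Reasoning models in the R1 family typically emit their chain of thought before the final answer. A minimal sketch for post-processing the Quick Start `response`, assuming the model wraps its reasoning in DeepSeek-R1-style `<think>...</think>` tags (verify this against actual outputs; the tag convention is an assumption, not confirmed by this card):

```python
def split_reasoning(response: str) -> tuple[str, str]:
    """Split an R1-style response into (reasoning, final_answer).

    Assumes the chain of thought is wrapped in <think>...</think> tags;
    if no closing tag is found, the whole response is treated as the answer.
    """
    if "</think>" in response:
        reasoning, _, answer = response.partition("</think>")
        reasoning = reasoning.replace("<think>", "").strip()
        return reasoning, answer.strip()
    return "", response.strip()

# Hypothetical example output for the Quick Start prompt above.
example = "<think>6x - 4x = 5 - 7, so 2x = -2, x = -1.</think>\nx = -1"
reasoning, answer = split_reasoning(example)
```

This keeps the full reasoning trace available for inspection while letting you display only the final answer to end users.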
✨ Features
Overview
- DeepSeek's distilled models sometimes reason in Chinese or English even when prompted in another language.
- Open-source models still need improvement on relatively low-resource languages.
- We are motivated to reproduce R1 and contribute to the community.
Training
- We trained the model on WiroAI/dolphin-r1-turkish for 2 epochs. We used a learning rate of 1e-5 and a max sequence length of 4096. The training followed a cosine learning rate schedule with a 10% warm-up phase.
- Training took 3 days on an 8xA6000 ADA cluster.
- Normally, the R1 team compares the performance of OpenR1 models to DeepSeek-Distill-Qwen-7B and OpenThinker-7B using lighteval. However, since those datasets are MATH-oriented only, we won't disclose the default results, as no conclusive findings can be drawn from them.
You can find the training and evaluation code at: https://github.com/huggingface/open-r1/
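The hyperparameters above could be expressed as a recipe along these lines (a hypothetical sketch in the style of open-r1's YAML configs; field names and the base model id `Qwen/Qwen2.5-7B-Instruct` are assumptions, so check the linked repository for the exact schema):

```yaml
# Hypothetical SFT recipe reflecting the hyperparameters described above.
model_name_or_path: Qwen/Qwen2.5-7B-Instruct   # assumed base model
dataset_name: WiroAI/dolphin-r1-turkish
num_train_epochs: 2
learning_rate: 1.0e-05
lr_scheduler_type: cosine
warmup_ratio: 0.1
max_seq_length: 4096
```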
📚 Documentation
Evaluation
- We observed that the reasoning process has slightly improved. Our model thinks more clearly in Turkish compared to DeepSeek's reasoning model.
- This model was trained for experimental purposes, and any benchmark evaluation is highly appreciated. Please note that this model will produce more tokens compared to normal models and will consume more VRAM during inference.
- If you are willing to evaluate this model, please ensure that it is allowed to produce enough tokens. Generate-until requests that restrict the output to fewer than 4000 tokens will lead to poor results.
- We believe that democratized and culturally improved open-source models will be achieved through sharing and experiments!
🤗 Community
We would like to thank the Hugging Face staff and everyone who contributed to the Open-R1 project!
📄 License
This project is licensed under the Apache 2.0 license.
Citation
```bibtex
@misc{WiroAI2025OpenR1Turkish,
  title={OpenR1-Qwen-7B-Turkish},
  author={Bezir, Abdullah and Asmazo{\u{g}}lu, Cengiz},
  year={2025},
  url={https://huggingface.co/WiroAI/OpenR1-Qwen-7B-Turkish}
}
```
Information Table
| Property | Details |
|---|---|
| Model Type | Fine-tuned version of Qwen2.5-Instruct |
| Training Data | WiroAI/dolphin-r1-turkish |
⚠️ Important Note
This model will produce more tokens compared to normal models and consume more VRAM during inference. When evaluating, make sure the model is allowed to generate enough tokens (at least 4000).
💡 Usage Tip
For complex reasoning tasks, use a larger `max_new_tokens` value so the model can fully express its reasoning process.