🚀 Qwen2.5-0.5B-Instruct-Gensyn-Swarm-feathered_giant_ostrich
This model is a fine-tuned version of Gensyn/Qwen2.5-0.5B-Instruct, trained using TRL. It offers enhanced performance in text-generation tasks.
🚀 Quick Start
Basic Usage
from transformers import pipeline
question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
generator = pipeline("text-generation", model="chinna6/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-feathered_giant_ostrich", device="cuda")
output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])
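The snippet above passes device="cuda", which fails on a machine without a GPU. The following is a minimal sketch of a CPU fallback, assuming PyTorch is the backend. The helper names `pick_device`, `build_messages`, and `generate` are hypothetical and not part of the model card or of transformers.

```python
from typing import Dict, List

MODEL_ID = "chinna6/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-feathered_giant_ostrich"


def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no GPU to query; default to CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def build_messages(question: str) -> List[Dict[str, str]]:
    """Wrap a question in the chat format used in the Quick Start snippet."""
    return [{"role": "user", "content": question}]


def generate(question: str, max_new_tokens: int = 128) -> str:
    """Run the model on one question (downloads the weights on first use)."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=MODEL_ID, device=pick_device())
    output = generator(
        build_messages(question),
        max_new_tokens=max_new_tokens,
        return_full_text=False,
    )[0]
    return output["generated_text"]
```

Calling `generate("What is the capital of France?")` should then run on whichever device is available, though CPU inference will be slower.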
✨ Features
- Fine-Tuned: Based on Gensyn/Qwen2.5-0.5B-Instruct, it has been fine-tuned to improve performance.
- Trained with TRL: Utilizes TRL for training, enabling reinforcement learning techniques.
📦 Installation
This model uses the transformers library. You can install it using the following command:
pip install transformers
📚 Documentation
Training procedure
This model was trained with GRPO, a method introduced in DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models.
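As a rough illustration of the core idea in GRPO, the sketch below computes group-relative advantages: several completions are sampled for the same prompt, and each completion's reward is normalized by the mean and standard deviation of its group. This is a simplified sketch, not the TRL implementation or this model's training code. The function name, the use of the population standard deviation, and the `eps` stabilizer are assumptions made for illustration.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def group_relative_advantages(
    rewards: Sequence[float], eps: float = 1e-8
) -> List[float]:
    """Normalize each reward against the other completions in its group.

    rewards: one scalar reward per completion sampled for the same prompt.
    eps: small constant (an assumption here) that avoids division by zero
         when every completion in the group received the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; a sample std is also plausible
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four completions for one prompt, two of which were rewarded.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Completions that score above the group average receive a positive advantage and those below receive a negative one, so no separate value model is needed to provide a baseline.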
Framework versions
| Property | Details |
|---|---|
| TRL | 0.15.2 |
| Transformers | 4.48.2 |
| Pytorch | 2.5.1 |
| Datasets | 3.6.0 |
| Tokenizers | 0.21.1 |
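To compare a local environment against the versions listed above, a small check along these lines may help. It is a sketch that uses only the standard library. The package names are the usual PyPI distribution names (with `torch` for Pytorch), and the helper names `installed_version` and `find_mismatches` are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Callable, Dict, Optional, Tuple

# Versions reported in the framework versions list of this model card.
EXPECTED: Dict[str, str] = {
    "trl": "0.15.2",
    "transformers": "4.48.2",
    "torch": "2.5.1",
    "datasets": "3.6.0",
    "tokenizers": "0.21.1",
}


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def find_mismatches(
    expected: Dict[str, str] = EXPECTED,
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each package whose version differs to (expected, found)."""
    mismatches: Dict[str, Tuple[str, Optional[str]]] = {}
    for name, wanted in expected.items():
        found = lookup(name)
        # Drop local build tags such as "+cu121" before comparing.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches().items():
        print(f"{name}: expected {wanted}, found {found}")
```

A mismatch does not necessarily mean the model will fail to load; the listed versions are simply the ones reported for training.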
📄 License
This model is under the license
license.
📚 Citations
Cite GRPO as:
@article{zhihong2024deepseekmath,
    title  = {{DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models}},
    author = {Zhihong Shao and Peiyi Wang and Qihao Zhu and Runxin Xu and Junxiao Song and Mingchuan Zhang and Y. K. Li and Y. Wu and Daya Guo},
    year   = 2024,
    eprint = {arXiv:2402.03300},
}
Cite TRL as:
@misc{vonwerra2022trl,
    title        = {{TRL: Transformer Reinforcement Learning}},
    author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallouédec},
    year         = 2020,
    journal      = {GitHub repository},
    publisher    = {GitHub},
    howpublished = {\url{https://github.com/huggingface/trl}}
}