Deepseek R1 Distill Qwen 14B GRPO Taiwan Spirit
Developed by kartd
A fine-tuned version of a Qwen-14B-based model (DeepSeek R1 Distill Qwen 14B, per the model name), trained with the GRPO method and suited to text generation tasks.
Downloads 111
Release Time: 6/4/2025
Model Overview
This model is a fine-tuned version of DeepSeek R1 Distill Qwen 14B (as its name indicates), trained with the TRL library and intended mainly for text generation tasks.
Model Features
GRPO training method
Trained using GRPO (Group Relative Policy Optimization), the method proposed in the DeepSeekMath paper to improve mathematical reasoning ability.
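The core idea of GRPO can be illustrated with a minimal sketch: several completions are sampled for the same prompt, each is scored by a reward function, and every reward is normalized against the mean and standard deviation of its own group, so no separate value model is needed. The function name `group_relative_advantages`, the `eps` constant, the use of the population standard deviation, and the sample rewards are assumptions made for illustration; they are not taken from this model's training setup.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own group (hypothetical helper).

    rewards: scores for several completions sampled from ONE prompt.
    eps: small constant (an assumption) that avoids division by zero
         when every completion in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; implementations may differ
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers to one prompt: two judged correct, two incorrect.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
# Correct answers get a positive advantage, incorrect ones a negative one.
```

Completions with above-average reward in their group are reinforced and below-average ones are discouraged; when all rewards in a group are equal, the advantages are zero and that group contributes no learning signal.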
Fine-tuning based on Qwen-14B
Fine-tuned from a Qwen-14B-based model, inheriting its text generation ability.
TRL training framework
Trained using the TRL (Transformer Reinforcement Learning) framework, optimizing the model's generation results.
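As a rough sketch of what GRPO training with TRL can look like, the snippet below wires a toy reward function into TRL's `GRPOTrainer`. Everything specific here is an assumption and not this model's actual recipe: the reward function `length_reward`, the helper `build_trainer`, the dataset `trl-lib/tldr`, the output directory, and the base checkpoint ID `deepseek-ai/DeepSeek-R1-Distill-Qwen-14B` are illustrative choices. The page does not state which data, rewards, or hyperparameters were used.

```python
def length_reward(completions, **kwargs):
    """Toy reward (an assumption): prefer completions near 200 characters."""
    return [-abs(200 - len(text)) / 200 for text in completions]


def build_trainer():
    """Construct a GRPO trainer; needs `trl` and `datasets` installed."""
    # Imported lazily so the reward function can be tried without these packages.
    from datasets import load_dataset
    from trl import GRPOConfig, GRPOTrainer

    # Assumed example dataset with a "prompt" column, not the model's real data.
    dataset = load_dataset("trl-lib/tldr", split="train")
    config = GRPOConfig(output_dir="grpo-sketch")  # hypothetical output path
    return GRPOTrainer(
        model="deepseek-ai/DeepSeek-R1-Distill-Qwen-14B",  # assumed base checkpoint
        reward_funcs=length_reward,
        args=config,
        train_dataset=dataset,
    )


# To launch training (downloads the model and needs substantial GPU memory):
# trainer = build_trainer()
# trainer.train()
```

The reward function receives the sampled completions and returns one score per completion; a task-specific reward (for example, checking a math answer) would replace the length heuristic in practice.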
Model Capabilities
Text generation
Mathematical reasoning
Use Cases
Text generation
Time travel choice
Generate text responses about time travel choices
Generate coherent and logical text responses
Mathematical reasoning
Mathematical problem solving
Solve complex mathematical problems
Generate accurate mathematical reasoning and solutions