DeepSeek-R1-Distill-Qwen-14B-GRPO-Taiwan-Spirit Open-source Text Generation Model

Deepseek R1 Distill Qwen 14B GRPO Taiwan Spirit

Developed by kartd

This is a fine-tuned version based on the Qwen-14B model, trained using the GRPO method, suitable for text generation tasks.

Downloads 111

Release Time : 6/4/2025

Model Overview

This model is a fine-tuned version based on a specific model, trained using TRL, mainly used for text generation tasks.

GRPO training method

Trained using the GRPO method, which was proposed in the DeepSeekMath paper and optimizes mathematical reasoning ability.

Fine-tuning based on Qwen-14B

Fine-tuned based on the Qwen-14B model, inheriting its powerful text generation ability.

TRL training framework

Trained using the TRL (Transformer Reinforcement Learning) framework, optimizing the model's generation results.

Text generation

Mathematical reasoning

Text generation

Time travel choice

Generate text responses about time travel choices

Generate coherent and logical text responses

Mathematical reasoning

Mathematical problem solving

Solve complex mathematical problems

Generate accurate mathematical reasoning and solutions

Featured Recommended AI Models

Empowering the Future, Your AI Solution Knowledge Base