Deepseek R1 Distill Qwen 14B GRPO Taiwan Spirit

Developed by kartd
A fine-tuned version of a Qwen-14B-based model, trained with the GRPO method and suited to text generation tasks.
Downloads 111
Release Time: 6/4/2025

Model Overview

This model is a fine-tuned version of a Qwen-14B-based model (the model name indicates DeepSeek R1 Distill Qwen 14B). It was trained with the TRL library and is mainly used for text generation tasks.

Model Features

GRPO training method
Trained with GRPO (Group Relative Policy Optimization), the method proposed in the DeepSeekMath paper to improve mathematical reasoning ability.
Fine-tuning based on Qwen-14B
Fine-tuned from a Qwen-14B-based model, inheriting its text generation ability.
TRL training framework
Trained with the TRL (Transformer Reinforcement Learning) framework, which provides the reinforcement learning loop used to improve the model's generated outputs.
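
To illustrate the GRPO idea mentioned above, here is a minimal sketch of the group-relative advantage step described in the DeepSeekMath paper: several completions are sampled for one prompt, and each reward is normalized against the mean and standard deviation of its group. This is not the training code used for this model. The function name, the eps value, the example rewards, and the choice of population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize each reward against the other completions of the same prompt.

    Hypothetical helper for illustration; eps avoids division by zero
    when every completion in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation over the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Assumed rewards for four completions of one prompt (1.0 = correct answer).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that score above their group's mean get a positive advantage and are reinforced, while those below the mean get a negative one. Because the baseline comes from the group itself, no separate value model is needed.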

Model Capabilities

Text generation
Mathematical reasoning

Use Cases

Text generation
    Time travel choice
        Generate text responses about time travel choices.
        Generate coherent and logical text responses.
Mathematical reasoning
    Mathematical problem solving
        Solve complex mathematical problems.
        Generate accurate mathematical reasoning and solutions.