QwQ-32B-ArliAI-RpR-v4
The best role-playing and creative writing model from ArliAI, featuring reduced repetitions, increased training sequence length, and high-quality reasoning capabilities.
Quick Start
You can access the model at https://arliai.com, and we also have a model ranking page at [https://www.arliai.com/models-ranking](https://www.arliai.com/models-ranking).
Ask questions in our new Discord Server https://discord.com/invite/t75KbPgwhk or on our subreddit https://www.reddit.com/r/ArliAI/.
Features
RpR v4 Changes
- Reduced repetitions and impersonation: Compared to RpR v3, a more advanced filtering method was employed to eliminate instances where the LLM repeated similar phrases or spoke on behalf of the user. Any remaining repetition or impersonation stems from the training of the base QwQ model, not the RpR dataset.
- Increased training sequence length: The training sequence length was extended to 16K to improve awareness and memory, even during longer chats.
RpR Series Overview
RpR (RolePlay with Reasoning) is a new model series from ArliAI, building directly on the successful dataset curation and training methods of the RPMax series. These models use a curated, deduplicated RP and creative writing dataset, emphasizing variety to ensure high creativity and minimize cross-context repetition.
Documentation
RpR Series: Building on RPMax with Reasoning
The RpR series uses the same curated dataset as RPMax, focusing on variety. The release of QwQ, an easily trainable high-performing open-source reasoning model, revealed limitations in existing instruct and creative writing reasoning datasets. To address this, Arli AI created a reasoning RP dataset by re-processing the RPMax dataset. The training process was carefully designed to mimic the model's inference usage, resulting in coherent and engaging outputs in long multi-turn RP chats.
Model Description
QwQ-32B-ArliAI-RpR-v4 is the fourth release in the RpR series. It is a 32-billion-parameter model fine-tuned using the RpR dataset, which combines the RPMax dataset with techniques to maintain reasoning abilities in long multi-turn chats.
Recommended Samplers
- RpR models do not work well with repetition penalty samplers, even advanced ones like XTC or DRY.
- They perform best with simple sampler settings and a high max tokens value to allow room for reasoning.
- You can also download the ST master export uploaded in the files section of this repo.
Recommended starting parameters (a minimal usage sketch follows this list):
- Temperature: 1.0
- MinP: 0.02
- TopK: 40
- Response Tokens: 2048+
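To illustrate how these settings might be applied, here is a minimal sketch using an OpenAI-compatible chat endpoint. The base_url, model name, and the extra_body fields for min_p/top_k are assumptions (not taken from this card) and depend on your provider or backend:

```python
# Minimal sketch, not an official ArliAI client example. The base_url, model name,
# and extra_body sampler fields are assumptions; adjust them to your provider/backend.
from openai import OpenAI

client = OpenAI(base_url="https://api.arliai.com/v1", api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="QwQ-32B-ArliAI-RpR-v4",
    messages=[{"role": "user", "content": "Start a scene in a rain-soaked neon city."}],
    temperature=1.0,                            # recommended starting value
    max_tokens=2048,                            # leave headroom for the reasoning block
    extra_body={"min_p": 0.02, "top_k": 40},    # backend-specific sampler fields
)
print(response.choices[0].message.content)
```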
Specs
| Property | Details |
|---|---|
| Base Model | QwQ-32B |
| Max Context Length | 128K with YaRN (natively 32K, like the base QwQ) |
| Parameters | 32B |
| Reasoning Model | Yes |
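The 128K figure assumes YaRN rope scaling is enabled, as with the base QwQ model. Below is a rough sketch of one way to do this with transformers; the rope_scaling values follow the base QwQ/Qwen documentation rather than this card, and a recent transformers version with YaRN support for Qwen2-family models is assumed:

```python
# Sketch only: enabling YaRN rope scaling for long contexts. Values follow the
# base QwQ documentation (factor 4.0 over the native 32K), not this card.
from transformers import AutoConfig, AutoModelForCausalLM

model_id = "ArliAI/QwQ-32B-ArliAI-RpR-v4"
config = AutoConfig.from_pretrained(model_id)
config.rope_scaling = {
    "type": "yarn",
    "factor": 4.0,                               # 32K native * 4 = 128K
    "original_max_position_embeddings": 32768,
}

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    config=config,
    torch_dtype="auto",
    device_map="auto",
)
```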
Training Details
- Sequence Length: 16384
- Epochs: 1 epoch training (Inherited from RPMax methods)
- Fine-tuning Method: RS-QLORA+ (Rank-Stabilized LoRA + LoRA Plus 8x); see the sketch after this list
- Rank/Alpha: 128-rank, 128-alpha
- Learning Rate: 0.00001
- Scheduler: Rex
- Gradient accumulation: 32
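As a rough sketch, the adapter settings above might map onto a peft LoraConfig as shown below. This is an approximation, not ArliAI's actual training code: use_rslora covers the rank-stabilized part, while the LoRA+ 8x learning-rate ratio and QLoRA 4-bit quantization are handled by the training framework and only noted in comments. The dropout and target_modules values are assumptions.

```python
# Approximate sketch of the adapter settings, not the actual RpR training script.
# LoRA+ (8x LR ratio) and 4-bit QLoRA quantization are applied by the training
# framework (e.g. via a loraplus_lr_ratio option) and are not shown here.
from peft import LoraConfig

lora_config = LoraConfig(
    r=128,                          # rank 128
    lora_alpha=128,                 # alpha 128
    lora_dropout=0.0,               # assumption; not stated on the card
    use_rslora=True,                # rank-stabilized LoRA
    target_modules="all-linear",    # assumption; common for QLoRA-style fine-tunes
    task_type="CAUSAL_LM",
)
```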
Quantization
- BF16: [https://huggingface.co/ArliAI/QwQ-32B-ArliAI-RpR-v4](https://huggingface.co/ArliAI/QwQ-32B-ArliAI-RpR-v4)
- GGUF: [https://huggingface.co/ArliAI/QwQ-32B-ArliAI-RpR-v4-GGUF](https://huggingface.co/ArliAI/QwQ-32B-ArliAI-RpR-v4-GGUF)
How to use reasoning models correctly in ST
For reasoning models in ST:
- Set the prefix to ONLY <think> and the suffix to ONLY </think>, with no extra spaces or newlines.
- Ensure the reply starts with <think>.
- Uncheck "Always add character names".
- Set "Include names" to "never".
- The chat template should conform to the model being used.
Note: Reasoning models work correctly only when "Include names" is set to "never". Otherwise, the model may be confused about whether to respond or reason first.
If you don't see the reasoning wrapped in the thinking block, check your settings or update your ST version. If the whole response ends up inside the reasoning block, check for extra spaces or newlines in the <think> and </think> tokens.
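Outside of ST, the same requirement applies: the assistant turn should begin inside the thinking block. A small sketch using the tokenizer's chat template, assuming the model ships a QwQ-style template that opens the thinking block in the generation prompt:

```python
# Sketch assuming the model ships a QwQ-style chat template that opens the
# thinking block at the start of the assistant turn.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("ArliAI/QwQ-32B-ArliAI-RpR-v4")
messages = [{"role": "user", "content": "Describe the tavern as my character walks in."}]
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# If the template behaves like base QwQ's, the prompt ends with "<think>\n",
# so the model's reply begins inside the reasoning block.
print(prompt.endswith("<think>\n"))
```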
The RPMax Foundation (Dataset & Training Philosophy)
The Goal: Reduced Repetition and Higher Creativity
The dataset curation for RPMax and RpR aims to reduce repetition and enhance the model's creative writing ability across different situations.
What is repetition and creativity?
- Creativity: Refers to the variety of outputs the model can generate, not just pleasant prose.
- Repetition:
- In-context repetition: Repeating phrases in a single conversation. RPMax and RpR do not yet focus on eliminating this type of repetition.
- Cross-context repetition: Repeating phrases or tropes in different situations. This is always bad and is the main target of the dataset curation.
Dataset Curation
The dataset for RPMax and RpR is curated from open-source creative writing and RP datasets on Hugging Face. Synthetic datasets are removed, and Llama 3.1 8B is used to de-dupe the dataset.
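The card does not describe the dedup pipeline in detail. As a generic illustration only (not ArliAI's actual Llama 3.1 8B setup), near-duplicate filtering can be sketched as embedding-similarity screening; the embedding model and threshold below are placeholders:

```python
# Generic near-duplicate filtering sketch; the model name and threshold are
# placeholders and do not reflect ArliAI's actual dedup pipeline.
import numpy as np
from sentence_transformers import SentenceTransformer

def near_dedupe(samples, threshold=0.9):
    """Keep a sample only if it is not too similar to any sample already kept."""
    model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder embedding model
    kept, kept_vecs = [], []
    for text in samples:
        vec = model.encode(text, normalize_embeddings=True)
        if all(float(np.dot(vec, v)) < threshold for v in kept_vecs):
            kept.append(text)
            kept_vecs.append(vec)
    return kept
```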
The Golden Rule of Fine-Tuning
For fine - tuning, quality is more important than quantity. The curated dataset is smaller but results in a more unique model.
Training Parameters and Unconventional Approach
The RPMax and RpR methodology uses one epoch, low gradient accumulation, and a higher - than - normal learning rate. The loss curve is unstable but decreasing over time, allowing the model to learn from each example without over - fitting.
License
This project is licensed under the Apache 2.0 license.

Image generated using Arli AI Image Generation [https://www.arliai.com/image-generation](https://www.arliai.com/image-generation)



