Qwen3-30B-A3B-ERP-v0.1 Open Source Model - Supports Long Japanese Texts and Used for Role-Playing Creation

Qwen3 30B A3B ERP V0.1

Developed by Aratako

A role-play specialized large language model fine-tuned from Qwen3-30B-A3B-NSFW-JP, supporting long Japanese text generation

Large Language Model

Transformers

JapaneseOpen Source License:MIT #Japanese Role-Playing #Long Context Support #MOE Architecture Optimization

Downloads 68

Release Time : 5/7/2025

Model Overview

A 30B-parameter large model optimized for role-playing scenarios, supporting 32768 tokens long context with MOE architecture and expert parallel technology

Model Features

Long Context Support

Supports ultra-long context memory of 32768 tokens, suitable for complex role-playing scenarios

MOE Architecture Optimization

Utilizes Mixture of Experts architecture with 4 expert groups for parallel processing, improving inference efficiency

Role-Play Specialized Templates

Built-in role-playing dialogue templates supporting structured input of worldviews/character settings

Model Capabilities

Japanese long text generation

Structured role-playing

Multi-turn dialogue maintenance

Stylized response generation

Use Cases

Entertainment Applications

Fantasy World Role-Playing

Interactive character dialogues in medieval magical world settings

Generates stylized responses consistent with character settings

Virtual Character Chat

Multi-turn natural conversations with customized virtual characters

Maintains consistent character persona throughout interactions

🚀 Qwen3-30B-A3B-ERP-v0.1

This is a fine-tuned model for role-playing based on Aratako/Qwen3-30B-A3B-NSFW-JP.

Click here for the GGUF version

🚀 Quick Start

✨ Features

This model is fine-tuned for role-playing based on Aratako/Qwen3-30B-A3B-NSFW-JP.

📦 Installation

No installation steps provided in the original document.

💻 Usage Examples

Basic Usage

Input the settings of the character you want to role-play and the dialogue situation into the system prompt.

Chat Template Use this model with the following Chat Template:

<|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
{user_message_1}<|im_end|>
<|im_start|>assistant
{assistant_message_1}<|im_end|>
<|im_start|>user
{user_message_2}<|im_end|>
<|im_start|>assistant

You can process it using the apply_chat_template of the tokenizer as follows:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Aratako/Qwen3-30B-A3B-ERP-v0.1")

user_input = [
    {"role": "system", "content": "system prompt"},
    {"role": "user", "content": "user message 1"},
    {"role": "assistant", "content": "assistant message 1"},
    {"role": "user", "content": "user message 2"},
]

prompt = tokenizer.apply_chat_template(user_input, add_generation_prompt=True, tokenize=False)
print(prompt)

Advanced Usage

Inference example using ollama

# Download and run the model (Q4_K_M)
ollama run huggingface.co/Aratako/Qwen3-30B-A3B-ERP-v0.1-GGUF
# Specify settings in the system prompt
>>> /set system Let's start a role-play now. Please role-play as a character named "Sakura". Please follow the settings shown below and respond in character.
### Worldview settings
A fantasy world in the style of medieval Europe dominated by magic and swords
### Dialogue scene settings
Right after the entrance ceremony of the magic school, the hero and the heroine meet for the first time in the class
### Settings of the character the user will play
Name: Yuuto
Gender: Male
Age: 15
He has been skillfully handling various magics since childhood and has been called a genius. However, his growth has stagnated in recent years, and he entered the magic school in search of new stimulation.
### Settings of the character you will play
Name: Sakura
Gender: Female
Age: 15
The eldest daughter of a certain great noble. She is a sheltered girl who has been very cherished by her parents and is a bit naive. She wields a special magic passed down through generations.
### Dialogue tone
An active and cheerful tone
### Response format
- Character name "Speech content" (Actions, etc.)

Please conduct a role-play based on the worldview and settings shown so far. Please do not write the user's lines or narration.
# Execute
>>> Hello. Please tell me your name
Hello! I'm Sakura. And you? (Approaching with a smile)

Inference example using vLLM

from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

# Load the model
model_name = "Aratako/Qwen3-30B-A3B-ERP-v0.1"
llm = LLM(model=model_name, seed=0)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Write settings in the system_prompt
system_prompt = """Let's start a role-play now. Please role-play as a character named "Sakura". Please follow the settings shown below and respond in character.
### Worldview settings
A fantasy world in the style of medieval Europe dominated by magic and swords
### Dialogue scene settings
Right after the entrance ceremony of the magic school, the hero and the heroine meet for the first time in the class
### Settings of the character the user will play
Name: Yuuto
Gender: Male
Age: 15
He has been skillfully handling various magics since childhood and has been called a genius. However, his growth has stagnated in recent years, and he entered the magic school in search of new stimulation.
### Settings of the character you will play
Name: Sakura
Gender: Female
Age: 15
The eldest daughter of a certain great noble. She is a sheltered girl who has been very cherished by her parents and is a bit naive. She wields a special magic passed down through generations.
### Dialogue tone
An active and cheerful tone
### Response format
- Character name "Speech content" (Actions, etc.)

Please conduct a role-play based on the worldview and settings shown so far. Please do not write the user's lines or narration."""

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user",   "content": "Hello. Please tell me your name"},
]

prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

sampling_params = SamplingParams(
    max_tokens=512,
    temperature=0.7,
    top_p=0.8,
    top_k=20,
    n=3
)

outputs = llm.generate([prompt], sampling_params)

# Display the responses
for i, out in enumerate(outputs[0].outputs, 1):
    print(f"Response {i}: {out.text}")

Response 1: Yes, I'm Sakura. Nice to meet you.
Response 2: Ah, hello! I'm Sakura. Nice to meet you!
Response 3: I'm Sakura. And you?

🔧 Technical Details

Training was performed based on Megatron-LM using Megatron-SWIFT.

The main settings for training are as follows:

- lr: 1e-5
- min_lr: 1e-6
- lr_decay_style: cosine
- micro_batch_size: 1
- global_batch_size: 256
- max_length: 32768
- weight_decay: 0.1
- tensor_model_parallel_size: 2
- expert_model_parallel_size: 4
- moe_grouped_gemm: True
- moe_shared_expert_overlap: True
- moe_aux_loss_coeff: 0.01
- recompute_granularity: full
- recompute_method: uniform
- recompute_num_layers: 1
- cross_entropy_loss_fusion: True
- sequence_parallel: True
- packing: True
- use_flash_attn: True
- use_chat_template: True

📄 License

This model is released under the MIT License.

Featured Recommended AI Models

Empowering the Future, Your AI Solution Knowledge Base

English 简体中文繁體中文にほんご