🚀 Mistral-Nemo-BD-RP
Mistral-Nemo-BD-RP is a large language model (LLM) fine-tuned on the BeyondDialogue dataset. It is designed to generate high-quality responses in role-playing scenarios and supports both English and Chinese.
For more details, please refer to our paper and GitHub repository.
🚀 Quick Start
Prerequisites
Support for Mistral is included in recent releases of Hugging Face transformers. We advise you to install transformers>=4.42.0 to use the model.
pip install "transformers>=4.42.0"
Code Example
The following snippet uses the transformers text-generation pipeline, which applies the model's chat template to the messages, to show how to load the model and generate a reply.
from transformers import pipeline
# device_map="auto" places the model on the available device(s); it requires the accelerate package.
chatbot = pipeline("text-generation", model="yuyouyu/Mistral-Nemo-BD-RP", device_map="auto")
system_prompt_temp = """I want you to answer questions as if you are {role_name}, assuming you live in the world of {world} and mimicking {role_name}'s personality and speaking style. Use the tone, manner, and vocabulary that {role_name} would use. Please do not reveal that you are an AI or language model; you must always remember you are {role_name}.
{role_name}'s character traits are {character}.
{role_name}'s MBTI personality type is {MBTI}.
{role_name}'s speaking style is {style}.
Current scene:
{scene}
role's emotion (0-10, the higher the value, the more pronounced the emotion):
{emotion}
Now, please act as {role_name} and reply with a brief sentence to {chat_role}. Your intimacy level with them is {relationship} (0-10, the higher the value, the closer the relationship). Accurately display the MBTI personality, character traits, speaking style, and emotion you have been assigned."""
role_name = "Hamlet"
world = "8th Century Danish Royalty"
character = "extreme, strong, decisive"
MBTI = "Extraverted (E), Intuitive (N), Feeling (F), Judging (J)"
style = "indecisive, decisive, sentimental"
scene = "Inside the grand hall of Elsinore, lit by flickering torchlight, Hamlet paces anxiously as Elena conjures an ethereal mirage of the Danish landscape. Regal tapestries and opulent furnishings surround them, yet Hamlet's gaze is fixed on Elena's illusions. She gracefully weaves dissonance into the tapestry of reality, prompting Hamlet to clutch his chest in a moment of existential crisis. The weight of unspoken love and inner turmoil hangs in the air, thick with tension and anticipation."
emotion = "happiness: 1, sadness: 8, disgust: 5, fear: 7, surprise: 6, anger: 4"
chat_role = "Elena"
relationship = "7"
system_prompt = system_prompt_temp.format(
    role_name=role_name,
    world=world,
    character=character,
    MBTI=MBTI,
    style=style,
    scene=scene,
    emotion=emotion,
    chat_role=chat_role,
    relationship=relationship,
)
prompt = "Oh, dear Hamlet, dost thou see in these conjured whispers the paths unseen? Speak, for shadows may guide us to the truth bound within thy tormented soul."
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": prompt},
]
# The pipeline returns the whole conversation; the last message is the model's reply.
response = chatbot(
    messages,
    max_new_tokens=256,
    pad_token_id=chatbot.tokenizer.eos_token_id,
    do_sample=True,
    temperature=0.7,
)[0]["generated_text"][-1]["content"]
print(response)
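Because the system prompt is filled with str.format, a placeholder in the template that has no matching keyword argument raises a KeyError at format time. The sketch below is one way to catch that early and to build the messages list in a single step. The helper names (placeholder_names, build_messages) and the shortened demo template are hypothetical and not part of this repository; the full system_prompt_temp can be passed in the same way.

```python
from string import Formatter


def placeholder_names(template: str) -> set[str]:
    """Return the named {placeholders} that str.format would need."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def build_messages(template: str, fields: dict[str, str], user_prompt: str) -> list[dict[str, str]]:
    """Fill the system prompt and return chat messages, failing early on missing fields."""
    missing = placeholder_names(template) - set(fields)
    if missing:
        raise KeyError(f"missing template fields: {sorted(missing)}")
    return [
        {"role": "system", "content": template.format(**fields)},
        {"role": "user", "content": user_prompt},
    ]


# Shortened stand-in for system_prompt_temp (illustration only).
demo_template = "You are {role_name} in {world}. Speaking style: {style}."
demo_fields = {
    "role_name": "Hamlet",
    "world": "8th Century Danish Royalty",
    "style": "sentimental",
}
messages = build_messages(demo_template, demo_fields, "Speak, dear Hamlet.")
print(messages[0]["content"])
```

The returned list has the same shape as the messages passed to the pipeline above, so it can be handed to chatbot(...) unchanged.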
⚠️ Important Note
The examples for Mistral-Nemo-BD-RP use English role-playing. For Chinese examples, please refer to our other model repository, Qwen2-7B-BD-RP.
✨ Features
- Fine-tuned on the BeyondDialogue dataset.
- Capable of generating high-quality responses in various role-playing scenarios in both English and Chinese.
📦 Installation
pip install "transformers>=4.42.0"
📚 Documentation
Training details
We performed full fine-tuning of Mistral-Nemo-Instruct-2407 for 3 epochs (833 steps) with a global batch size of 128, a training sequence length of 4,096, and a learning rate of 3e-5. The training data comes from the BeyondDialogue dataset.
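As a rough sanity check on these figures, the step count and global batch size give an estimate of how many training sequences were processed. This is only back-of-the-envelope arithmetic under the assumption that every step consumes one full global batch; it is not a statement about the exact dataset size, and the variable names are illustrative.

```python
# Figures quoted in the training details.
total_steps = 833
global_batch_size = 128
epochs = 3
max_seq_len = 4096

# Sequences processed over the whole run, assuming one full global batch per step.
sequences_seen = total_steps * global_batch_size

# Approximate number of training sequences per epoch under the same assumption.
approx_sequences_per_epoch = sequences_seen / epochs

# Upper bound on tokens processed, assuming every sequence fills max_seq_len.
max_tokens_seen = sequences_seen * max_seq_len

print(f"sequences seen in total: {sequences_seen:,}")
print(f"approx. sequences per epoch: {approx_sequences_per_epoch:,.0f}")
print(f"upper bound on tokens seen: {max_tokens_seen:,}")
```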
Evaluation
We use objective questions to assess eight dimensions: Character, Style, Emotion, Relationship, Personality, Human-likeness, Coherence, and Role Consistency. The metric design is described in our paper, and the evaluation code is available on GitHub. The results are shown below:
| Model | Character ↑ | Style ↑ | Emotion ↓ | Relationship ↓ | Personality ↑ | Avg. ↑ | Human-likeness ↑ | Role Choice ↑ | Coherence ↑ |
|---|---|---|---|---|---|---|---|---|---|
| General Baselines (Proprietary) | | | | | | | | | |
| GPT-4o | 74.32 ± 1.15 | 81.67 ± 1.51 | 16.31 ± 0.48 | 12.13 ± 0.66 | 66.58 ± 4.41 | 78.83 ± 1.64 | 67.33 ± 3.95 | 87.33 ± 3.86 | 99.67 ± 0.33 |
| GPT-3.5-Turbo | 72.26 ± 1.27 | 73.66 ± 1.73 | 17.79 ± 0.56 | 14.17 ± 0.73 | 66.92 ± 4.85 | 76.18 ± 1.83 | 33.33 ± 4.43 | 83.00 ± 4.68 | 97.33 ± 1.17 |
| Moonshot-v1-8k | 74.06 ± 1.19 | 80.64 ± 1.51 | 16.17 ± 0.47 | 13.42 ± 0.70 | 67.00 ± 4.87 | 78.42 ± 1.75 | 44.00 ± 4.33 | 86.67 ± 3.75 | 99.33 ± 0.46 |
| Yi-Large-Turbo | 75.13 ± 1.22 | 79.18 ± 1.58 | 16.44 ± 0.49 | 13.48 ± 0.67 | 68.25 ± 4.61 | 78.53 ± 1.72 | 47.00 ± 4.60 | 84.33 ± 3.67 | 92.67 ± 2.39 |
| Deepseek-Chat | 75.46 ± 1.14 | 81.49 ± 1.51 | 15.92 ± 0.46 | 12.42 ± 0.63 | 67.92 ± 4.57 | 79.30 ± 1.66 | 52.33 ± 4.95 | 83.00 ± 4.68 | 96.67 ± 1.00 |
| Baichuan4 | 71.82 ± 1.25 | 76.92 ± 1.52 | 17.57 ± 0.52 | 12.30 ± 0.62 | 67.08 ± 4.75 | 77.19 ± 1.73 | 45.33 ± 4.31 | 82.33 ± 4.49 | 99.33 ± 0.46 |
| Hunyuan | 73.77 ± 1.18 | 78.75 ± 1.56 | 17.24 ± 0.48 | 13.22 ± 0.68 | 67.00 ± 4.39 | 77.81 ± 1.66 | 53.00 ± 4.29 | 84.33 ± 4.52 | 98.33 ± 0.84 |
| Role-play Expertise Baselines | | | | | | | | | |
| Index-1.9B-Character | 73.33 ± 1.32 | 76.48 ± 1.50 | 17.99 ± 0.53 | 13.58 ± 0.71 | 66.33 ± 4.57 | 76.92 ± 1.73 | 21.67 ± 3.96 | 78.67 ± 5.14 | 69.67 ± 3.85 |
| CharacterGLM-6B | 73.36 ± 1.28 | 76.08 ± 1.55 | 18.58 ± 0.55 | 14.27 ± 0.79 | 67.33 ± 4.34 | 76.79 ± 1.70 | 16.00 ± 2.38 | 81.00 ± 4.40 | 25.67 ± 3.48 |
| Baichuan-NPC-Turbo | 75.19 ± 1.23 | 79.15 ± 1.38 | 17.24 ± 0.51 | 13.10 ± 0.69 | 65.33 ± 4.84 | 77.87 ± 1.73 | 56.00 ± 4.66 | 86.33 ± 4.90 | 99.00 ± 0.56 |
| General Baselines (Open-source) | | | | | | | | | |
| Yi-1.5-9B-Chat | 75.31 ± 1.20 | 76.78 ± 1.49 | 16.67 ± 0.52 | 12.75 ± 0.66 | 67.42 ± 4.63 | 78.02 ± 1.70 | 38.67 ± 4.39 | 84.00 ± 4.61 | 92.67 ± 1.79 |
| GLM-4-9b-chat | 74.26 ± 1.19 | 78.40 ± 1.55 | 17.18 ± 0.50 | 14.48 ± 0.74 | 67.17 ± 4.93 | 77.63 ± 1.78 | 47.67 ± 4.25 | 83.33 ± 4.51 | 99.33 ± 0.46 |
| Qwen2-7B-Instruct | 75.39 ± 1.13 | 77.68 ± 1.65 | 17.64 ± 0.56 | 13.43 ± 0.7 | 67.75 ± 4.44 | 77.95 ± 1.70 | 48.00 ± 4.66 | 83.33 ± 4.48 | 99.00 ± 0.56 |
| Mistral-Nemo-Instruct-2407 | 74.12 ± 1.17 | 77.04 ± 1.48 | 17.00 ± 0.43 | 13.50 ± 0.67 | 67.00 ± 4.30 | 77.53 ± 1.61 | 53.67 ± 4.66 | 82.67 ± 4.77 | 74.33 ± 3.77 |
| Mistral-Nemo-BD-RP | 74.58 ± 1.28 | 78.47 ± 1.45 | 16.62 ± 0.48 | 11.38 ± 0.67* | 69.08 ± 4.46 | 78.83 ± 1.67 | 59.00 ± 4.46 | 87.00 ± 4.73 | 92.67 ± 1.59 |
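The Avg. column in these results appears to be consistent with averaging the five profile dimensions (Character, Style, Emotion, Relationship, Personality) after flipping the two lower-is-better scores to 100 - x. This reading is inferred from the reported numbers only and is not stated in this README, so treat the paper as the authoritative definition. A minimal sketch that recomputes the average for a few rows (the function name profile_average is hypothetical; small differences of about 0.01 can arise because the reported scores are already rounded):

```python
def profile_average(character, style, emotion, relationship, personality):
    """Mean of five dimensions, with the lower-is-better ones flipped to 100 - x."""
    scores = [character, style, 100 - emotion, 100 - relationship, personality]
    return sum(scores) / len(scores)


# (Character, Style, Emotion, Relationship, Personality), reported Avg.
rows = {
    "GPT-4o": ((74.32, 81.67, 16.31, 12.13, 66.58), 78.83),
    "GPT-3.5-Turbo": ((72.26, 73.66, 17.79, 14.17, 66.92), 76.18),
    "Mistral-Nemo-Instruct-2407": ((74.12, 77.04, 17.00, 13.50, 67.00), 77.53),
    "Mistral-Nemo-BD-RP": ((74.58, 78.47, 16.62, 11.38, 69.08), 78.83),
}

for name, (dims, reported) in rows.items():
    computed = profile_average(*dims)
    print(f"{name}: computed {computed:.2f}, reported {reported:.2f}")
```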
📄 License
This project is licensed under the Apache-2.0 license.
📖 Citation
Please cite our work if you find the resources in this repository useful:
@article{yu2024beyond,
title = {BEYOND DIALOGUE: A Profile-Dialogue Alignment Framework Towards General Role-Playing Language Model},
author = {Yu, Yeyong and Yu, Runsheng and Wei, Haojie and Zhang, Zhanqiu and Qian, Quan},
year = {2024},
journal = {arXiv preprint arXiv:2408.10903},
}
🥰 Acknowledgements
We would like to express our sincere gratitude to Tencent LightSpeed Studios for their invaluable support in this project. Their contributions and encouragement have been instrumental in the successful completion of our work.