🚀 Pygmalion-3 12B
Our latest roleplaying model, offering immersive role-playing experiences with advanced text generation capabilities.
📚 Documentation
Model Details
It's been a long and arduous journey filled with delays, technical glitches, and countless moments of frustration. But we're thrilled to announce our return to the open-source roleplaying scene with our brand-new model, Pygmalion-3. We've leveraged Mistral's Nemo base model and fed it hundreds of millions of tokens from conversations, creative writing, and instructions. The result is a model dedicated to roleplaying that we hope will exceed your expectations.
As part of our open-source ethos and our commitment to our long-standing supporters, we're releasing this model under the permissive Apache 2.0 license. This allows anyone in the local models community to use and build upon our work.
Prompting
We've transitioned to the standard ChatML format, both for convenience and to make merging with other ChatML-based models easier. Like our previous model, Pygmalion-2, Pygmalion-3 supports the "Enter X mode" style of system prompt. However, we encourage you to experiment with the system prompt to find what works best for your needs.
⚠️ Important Note
Some strange issues have been reported with the <|im_end|> token. We highly recommend adding a custom token ban on the string "<|im_end|>" and on "<" in general. We apologize for the inconvenience.
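The token ban itself is configured in whichever frontend or inference backend you use, so it is not shown here. As a complementary, client-side fallback, the sketch below cuts a generated reply at the first stray end-of-turn marker. The function name `trim_reply` is hypothetical, and this is a different technique from the recommended ban, not a replacement for it.

```python
def trim_reply(text: str, stop: str = "<|im_end|>") -> str:
    """Return the reply up to (not including) the first `stop` marker.

    Hypothetical helper: a post-processing fallback, not the token ban
    recommended above.
    """
    # partition() returns the whole string as `head` when `stop` is absent.
    head, _, _ = text.partition(stop)
    return head.rstrip()


if __name__ == "__main__":
    raw = "It's three in the morning, man.<|im_end|>\n<|im_start|>user"
    print(trim_reply(raw))
```

Trimming after the fact only hides a leaked marker in the displayed text; it does not stop the model from spending tokens on it, which is why a ban or stop string at generation time is still preferable.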
Usage Examples
Basic Usage
<|im_start|>system
Enter roleplay mode. You shall reply to {{user}} while staying in character. Your responses must be detailed, creative, immersive, and drive the scenario forward. You will follow {{char}}'s persona.<|im_end|>
<|im_start|>user
{{user}}: Good evening!<|im_end|>
<|im_start|>assistant
{{char}}: It's three in the morning, man.<|im_end|>
Note that {{user}} and {{char}} are placeholders for the user's and the character's names; substitute actual names before sending the prompt.
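If you are assembling prompts by hand rather than through a frontend, the format above can be sketched roughly as follows. The helper names (`fill`, `build_prompt`) and the example names "Anon" and "Aria" are hypothetical, and ending the prompt with an open assistant turn that starts with the character's name is an assumption based on the example, not a documented requirement.

```python
# System prompt copied from the basic usage example above.
SYSTEM_TEMPLATE = (
    "Enter roleplay mode. You shall reply to {{user}} while staying in "
    "character. Your responses must be detailed, creative, immersive, and "
    "drive the scenario forward. You will follow {{char}}'s persona."
)


def fill(text: str, user: str, char: str) -> str:
    """Replace the {{user}} and {{char}} placeholders with actual names."""
    return text.replace("{{user}}", user).replace("{{char}}", char)


def build_prompt(user, char, history, system=SYSTEM_TEMPLATE):
    """Build a ChatML prompt string.

    `history` is a list of (role, text) pairs, where role is "user" or
    "assistant". Each turn is prefixed with the speaker's name, as in the
    example above.
    """
    parts = [f"<|im_start|>system\n{fill(system, user, char)}<|im_end|>"]
    for role, text in history:
        speaker = user if role == "user" else char
        parts.append(f"<|im_start|>{role}\n{speaker}: {text}<|im_end|>")
    # Assumption: leave the assistant turn open so the model continues
    # speaking as the character.
    parts.append(f"<|im_start|>assistant\n{char}:")
    return "\n".join(parts)


if __name__ == "__main__":
    print(build_prompt("Anon", "Aria", [("user", "Good evening!")]))
```

Keeping the system prompt as a parameter makes it easy to experiment with alternatives to the "Enter roleplay mode" wording.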
Dataset
We've amassed a vast collection of instructions and roleplaying data, totaling hundreds of millions of tokens. This includes our PIPPA dataset and data from roleplaying forums.
Limitations and biases
The intended use case for this model is fictional writing for entertainment purposes. Any other usage falls outside the scope of this model.
As such, this model was not fine-tuned to be completely safe and harmless. Both the base model and this fine-tuned version were trained on data that is known to contain profanity and lewd or otherwise offensive text. It may generate socially unacceptable or undesirable text, even if the prompt itself is not explicitly offensive. Moreover, the outputs may often be factually incorrect or misleading.
Technical Details
We trained our model as a rank-32 LoRA adapter. We ran one epoch over our data using 8x NVIDIA A40 GPUs. For this training run, we used a learning rate of 2e-4 and a total batch size of 24 across all GPUs. We used a cosine learning rate scheduler with a 100-step warmup, along with DeepSpeed ZeRO to reduce memory usage.
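To give a feel for these hyperparameters, here is a minimal sketch of a cosine schedule with warmup. Several details are assumptions not stated above: that the warmup is linear, that the rate decays to zero, that the batch is split evenly across GPUs, and the total step count (`total_steps=1000` is a made-up figure, since the actual number of steps is not given).

```python
import math

PEAK_LR = 2e-4        # learning rate stated above
WARMUP_STEPS = 100    # warmup length stated above
TOTAL_BATCH = 24      # total batch size across all GPUs
NUM_GPUS = 8

# Assuming an even split, each GPU handles 3 samples per optimizer step
# (possibly via gradient accumulation; the source does not say).
PER_GPU_BATCH = TOTAL_BATCH // NUM_GPUS


def lr_at(step: int, total_steps: int,
          peak: float = PEAK_LR, warmup: int = WARMUP_STEPS) -> float:
    """Learning rate at `step` for linear warmup followed by cosine decay.

    Assumptions: linear warmup from 0 and decay to 0 at `total_steps`.
    """
    if step < warmup:
        return peak * step / warmup
    progress = (step - warmup) / max(1, total_steps - warmup)
    return peak * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total_steps = 1000  # hypothetical; not taken from the training run
    for s in (0, 50, 100, 550, 1000):
        print(s, lr_at(s, total_steps))
```

The rate climbs to its peak over the first 100 steps and then follows half a cosine wave down, which avoids large updates both at the very start and near the end of the epoch.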
Acknowledgements
This project would not have been possible without the compute support of [Hive Digital Technologies](https://huggingface.co/H-D-T) and the [Axolotl](https://github.com/axolotl-ai-cloud/axolotl) training software.
We'd like to express our sincere gratitude to lemonilia for their invaluable assistance in compiling roleplay forum data.
Most importantly, we dedicate this model to our amazing community, who have stood by us through thick and thin. Thank you so much from the bottom of our hearts. We hope you enjoy our work to the fullest, and rest assured, more is on the way soon.
📄 License
This model is released under the Apache 2.0 license.