🚀 Rei-12B
Another prototype Magnum...
This model, initially an experiment on gradient clipping, was well-received by early testers and thus officially released. It's fine-tuned on Mistral-Nemo-Instruct (ChatML'ified) to replicate the prose quality of the Claude 3 models, using a prototype Magnum V5 datamix.
✨ Features
Dataset
- PocketDoc/Dans-Personamaxx-Logs
- anthracite-org/kalo-opus-instruct-22k-no-refusal
- lodrick-the-lafted/kalo-opus-instruct-3k-filtered
- anthracite-org/nopm_claude_writing_fixed
- anthracite-org/kalo_opus_misc_240827
- anthracite-org/kalo_misc_part2
- NewEden/Claude-Instruct-5K
- NewEden/Claude-Instruct-2.7K
Base Model
- NewEden/MistralAI-Nemo-Instruct-ChatML
Other Information
- Pipeline Tag: text-generation
- Library Name: transformers
- Language: en
- Tags: roleplay, finetune, mistral, magnum, claude, story-writing
📦 Quantized Models
- [EXL2 Quant](https://huggingface.co/Delta-Vector/Rei-V2-12B-EXL2/)
- [GGUF Quant](https://huggingface.co/Delta-Vector/Rei-V2-12B-GGUF)
💻 Usage Examples
Basic Usage - Prompt Format
Rei-12B uses the ChatML format. A typical conversation should be structured as:

```
<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
<|im_start|>assistant
```
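Since the base model is ChatML'ified, `tokenizer.apply_chat_template` from `transformers` will normally produce this format for you. If you're building the prompt by hand instead, the layout above can be sketched as follows (the helper name and message schema here are illustrative, not part of this model card):

```python
def to_chatml(messages):
    """Render a list of {"role", "content"} dicts in ChatML, ending with an
    open assistant turn so the model generates the next reply."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    # Leave the final assistant turn open for generation.
    parts.append("<|im_start|>assistant")
    return "\n".join(parts)


prompt = to_chatml([
    {"role": "user", "content": "Hi there!"},
    {"role": "assistant", "content": "Nice to meet you!"},
    {"role": "user", "content": "Can I ask a question?"},
])
```

The trailing `<|im_start|>assistant` (with no `<|im_end|>`) is what cues the model to continue as the assistant.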
Recommended System Prompt
The recommended system prompt is as follows:
```
Currently, your role is {{char}}, described in detail below. As {{char}}, continue the narrative exchange with {{user}}.
<Guidelines>
• Maintain the character persona but allow it to evolve with the story.
• Be creative and proactive. Drive the story forward, introducing plotlines and events when relevant.
• All types of outputs are encouraged; respond accordingly to the narrative.
• Include dialogues, actions, and thoughts in each response.
• Utilize all five senses to describe scenarios within {{char}}'s dialogue.
• Use emotional symbols such as "!" and "~" in appropriate contexts.
• Incorporate onomatopoeia when suitable.
• Allow time for {{user}} to respond with their own input, respecting their agency.
• Act as secondary characters and NPCs as needed, and remove them when appropriate.
• When prompted for an Out of Character [OOC:] reply, answer neutrally and in plaintext, not as {{char}}.
</Guidelines>
<Forbidden>
• Using excessive literary embellishments and purple prose unless dictated by {{char}}'s persona.
• Writing for, speaking, thinking, acting, or replying as {{user}} in your response.
• Repetitive and monotonous outputs.
• Positivity bias in your replies.
• Being overly extreme or NSFW when the narrative context is inappropriate.
</Forbidden>
Follow the instructions in <Guidelines></Guidelines>, avoiding the items listed in <Forbidden></Forbidden>.
```
🔧 Technical Details
Training - Hparams
- For this model's hyperparameters, we experimented with gradient clipping (`max_grad_norm`).
- By inspecting the model's weight distribution, we devised 3 different values, knowing that the weight distribution for Mistral is around 0.1.

- As the logs and testing show, setting the gradient clip too high can be detrimental to the model, causing overfitting. Setting it too low can also cause problems, as the 1e-4 run appears to be underfitting. The best value was 0.001, resulting in a model that neither overfits nor underfits.
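For reference, the rule that `max_grad_norm` controls is global-norm clipping: if the combined L2 norm of all gradients exceeds the threshold, every gradient is scaled down proportionally. A minimal sketch of that rule (mirroring the behavior of `torch.nn.utils.clip_grad_norm_` on a flat list of gradient values; the helper name is hypothetical):

```python
import math

def clip_grad_norm(grads, max_norm):
    """Scale a list of gradient values so their global L2 norm does not
    exceed max_norm; returns (clipped grads, pre-clip norm)."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > max_norm:
        # Uniform rescale preserves the gradient direction.
        scale = max_norm / (total_norm + 1e-6)
        grads = [g * scale for g in grads]
    return grads, total_norm
```

With a threshold as tight as 0.001, nearly every step is rescaled to the same small norm, which is why a value matched to the model's weight scale matters.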
Training - Configuration
The Axolotl configuration can be viewed [here](https://wandb.ai/new-eden/Rei-V2/artifacts/axolotl-config/config-7hvbucx9/v0/files/axolotl_config_pw8f0c6u.yml).
Training - Hardware
The model was trained for 2 epochs on 8x [NVIDIA H200](https://www.nvidia.com/en-us/data-center/h200/) GPUs, generously provided by @Kalomaze.

⚠️ Credits
I'd like to thank Ruka/Sama twinkman | LucyKnada | Kubernetes Bad | PocketDoc | Tav | Trappu | and the rest of Anthracite/Pygmalion for testing, feedback, and support.