# 🚀 patricide-12B-Unslop-Mell
*The sins of the Father shan't ever be repeated this way.*

This project is a merge of pre-trained language models created using mergekit. It aims to combine the strengths of the parent models into a new model with better overall performance.
## Key Information
| Property | Details |
|---|---|
| Base Models | inflatebot/MN-12B-Mag-Mell-R1, TheDrummer/UnslopNemo-12B-v4.1 |
| Library Name | transformers |
| Tags | mergekit, merge, 12b, chat, roleplay, creative-writing, SLERP |
| License | apache-2.0 |
| New Version | redrix/patricide-12B-Unslop-Mell-v2 |
## 🚀 Quick Start
This is my first attempt at merging models. Initially I had no idea how to write the parameters in the config, but I've since figured it out. If anyone has more in-depth guides on merging models, please share them with me; I'm also eager to understand the underlying science.
Both parent models produced satisfactory results on their own, so I merged them in the hope that the new model would inherit their good traits. Early testing showed good coherency, but the model sometimes generates unintelligible gibberish or made-up words, likely due to a broken tokenizer.
I've tested this model with the Q6_K GGUF quant, and the results were satisfactory, so I decided to upload it. Although I haven't extensively tested it in storywriting or roleplaying, the output was stable and at least coherent. I conducted the test with a Temperature of 1 (Temperature last) and Min-P of 0.1. I'm not sure how DRY or XTC affect output stability, or how the model performs at high context sizes. Both parent models use the ChatML template, and UnslopNemo also uses Metharme/Pygmalion; I haven't tested which one works better yet. (Update: mergekit now has a feature to define the template. In my next models, I'll force the use of ChatML to ensure a standard.)
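If you want to try those settings yourself, here is a minimal sketch using llama-cpp-python. The GGUF filename and context size are placeholders, and exact sampler ordering (Temperature last) depends on your backend or frontend:

```python
# Minimal sketch: load a local Q6_K GGUF quant and sample with the settings
# above (Temperature 1, Min-P 0.1). Filename and n_ctx are assumptions.
from llama_cpp import Llama

llm = Llama(
    model_path="patricide-12B-Unslop-Mell.Q6_K.gguf",  # path to your local quant
    n_ctx=8192,            # high-context behavior is untested (see above)
    chat_format="chatml",  # both parent models use the ChatML template
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a creative storyteller."},
        {"role": "user", "content": "Write the opening paragraph of a mystery."},
    ],
    temperature=1.0,
    min_p=0.1,
)
print(response["choices"][0]["message"]["content"])
```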
Feel free to experiment, as I'm still in the experimental stage myself.
Update: I'll likely release my next models once I can run them without excessive fine-tuning of samplers, parameters, text templates, etc. After that, I'll conduct extensive testing following DavidAU's approach to gain more insights while working on new models. I aim to create models that perform well in their base states, with samplers used only for further optimization, so I won't spend much time fine-tuning samplers unless the base state of the model shows great promise.
## 📦 Quantization
## 🔧 Merge Details
### Merge Method
This model was merged using the SLERP merge method.
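For intuition, here is a rough sketch of spherical linear interpolation between two weight tensors, the operation SLERP applies parameter by parameter. This is an illustration of the general technique (assuming NumPy), not mergekit's exact implementation:

```python
# Illustrative SLERP between two weight tensors w0 and w1 at factor t.
# Not mergekit's actual code -- a simplified sketch of the technique.
import numpy as np

def slerp(t: float, w0: np.ndarray, w1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    # Measure the angle between the two tensors, treated as flat vectors.
    v0 = w0.ravel() / (np.linalg.norm(w0) + eps)
    v1 = w1.ravel() / (np.linalg.norm(w1) + eps)
    dot = np.clip(np.dot(v0, v1), -1.0, 1.0)
    theta = np.arccos(dot)
    if theta < eps:
        # Nearly parallel tensors: plain linear interpolation is stable here.
        return (1 - t) * w0 + t * w1
    # Weight each endpoint so the interpolation follows the arc between them.
    s0 = np.sin((1 - t) * theta) / np.sin(theta)
    s1 = np.sin(t * theta) / np.sin(theta)
    return s0 * w0 + s1 * w1
```

With t = 0 this returns w0 unchanged and with t = 1 it returns w1, which is why the `t` schedule in the configuration below controls how much of each parent ends up in each part of the network.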
### Models Merged
The following models were included in the merge:
- TheDrummer/UnslopNemo-12B-v4.1
- inflatebot/MN-12B-Mag-Mell-R1
### Configuration
The following YAML configuration was used to produce this model:
```yaml
models:
  - model: TheDrummer/UnslopNemo-12B-v4.1
  - model: inflatebot/MN-12B-Mag-Mell-R1
merge_method: slerp
base_model: TheDrummer/UnslopNemo-12B-v4.1
dtype: bfloat16
parameters:
  t: [0, 0.5, 1, 0.5, 0]
```
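The `t` list defines an interpolation gradient across the layer stack: 0 keeps the base model's (UnslopNemo's) weights, 1 takes Mag-Mell's, and values in between blend them, so this merge leans on UnslopNemo at the first and last layers and on Mag-Mell in the middle. To reproduce the merge, saving the YAML as (for example) `config.yaml` and running mergekit's `mergekit-yaml config.yaml ./output-model` should work; the output path here is just a placeholder.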
I made the cover art myself in Photoshop... I don't use AI for stuff like that.