lzlv_70B
A Mythomax/MLewd_13B-style merge of selected 70B models, aiming to combine creativity with intelligence for an enhanced roleplaying and creative work experience.
Quick Start
lzlv_70B is a multi-model merge of several LLaMA2 70B finetunes designed for roleplaying and creative work. The intention was to create a model that fuses creativity and intelligence to offer an improved experience. Did it achieve that? Subjectively, it seemed better than each individual model in my tests.
GGUF Q4_K_M and Q5_K_M quants can be found here: https://huggingface.co/lizpreciatior/lzlv_70b_fp16_hf
Update 29/10
Thank you to TheBloke for making the whole range of quants for lzlv: https://huggingface.co/TheBloke/lzlv_70B-GGUF
Also recommended: lzlv merged with limarpv3 - check it out here: https://huggingface.co/Doctor-Shotgun/lzlv-limarpv3-l2-70b/tree/main
Thanks for merging the LoRA. I believe it adds a bit more creative flavor to the model.
lzlvV2 is in development. Coming soon(tm).
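For reference, here is a minimal sketch of running one of the GGUF quants with llama-cpp-python. The file name is an assumption; substitute whichever quant file you actually downloaded.

```python
# Minimal sketch: run a GGUF quant of lzlv_70B with llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="lzlv_70b.Q4_K_M.gguf",  # hypothetical local file name
    n_ctx=4096,        # context window
    n_gpu_layers=-1,   # offload all layers to GPU if available
)

# Vicuna-style prompt format (see Prompt Format below).
out = llm(
    "USER: Introduce yourself in one sentence.\nASSISTANT:",
    max_tokens=128,
    temperature=0.8,
)
print(out["choices"][0]["text"])
```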
Features
- Combines the creativity of multiple models with strong general intelligence.
- Retains the instruction-following capabilities of Xwin-70B while adopting more creativity.
- Handles complex scenarios better than some creative models.
Documentation
Procedure
Models Used
- NousResearch/Nous-Hermes-Llama2-70b: A great model for roleplaying, but not the best at following complex instructions.
- Xwin-LM/Xwin-LM-70B-V0.1: Excellent at following instructions and quite creative out of the box, so it was chosen as the base for the merge.
- Doctor-Shotgun/Mythospice-70b: The wildcard among the three. It's a creative, NSFW-oriented model. I discovered it while searching on Hugging Face. No one had released a quantized version, so I did it myself for testing. It fit well as the third component.
A big thank you to the creators of the above models. Note that Mythospice also includes Nous-Hermes, so it's technically present twice in this mix. This is a common practice among those working on 13B models and is unlikely to harm the model.
Merging Process
The merging process was heavily inspired by Undi95's approach in Undi95/MXLewdMini-L2-13B. Specifically, the ratios are:
- Component 1: Merge of Mythospice x Xwin with SLERP gradient [0.25, 0.3, 0.5].
- Component 2: Merge of Xwin x Hermes with SLERP gradient [0.4, 0.3, 0.25].
Finally, Component 1 and Component 2 were merged with SLERP using weight 0.5; a sketch of the SLERP operation itself follows below.
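For readers unfamiliar with SLERP merging, here is a minimal Python sketch of the underlying operation: spherical linear interpolation between two weight tensors, plus one plausible way a gradient like [0.25, 0.3, 0.5] could be spread across layer depth. This illustrates the technique; it is not the exact tooling used for lzlv.

```python
# Minimal sketch of SLERP between two weight tensors from checkpoints that
# share the same architecture. Illustration only, not the lzlv merge script.
import torch

def slerp(t: float, v0: torch.Tensor, v1: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors."""
    a = v0.flatten().float()
    b = v1.flatten().float()
    # Cosine of the angle between the two flattened weight vectors.
    cos_omega = torch.dot(a, b) / (a.norm() * b.norm() + eps)
    omega = torch.acos(cos_omega.clamp(-1.0, 1.0))
    if omega.abs() < 1e-4:
        # Vectors are nearly parallel: fall back to plain linear interpolation.
        merged = (1.0 - t) * a + t * b
    else:
        sin_omega = torch.sin(omega)
        merged = (torch.sin((1.0 - t) * omega) / sin_omega) * a \
               + (torch.sin(t * omega) / sin_omega) * b
    return merged.reshape(v0.shape).to(v0.dtype)

def gradient_t(layer_idx: int, num_layers: int, gradient: list[float]) -> float:
    """Interpolate a merge gradient (e.g. [0.25, 0.3, 0.5]) across layer depth.

    This spreads the gradient linearly from the first to the last layer; an
    assumption, since merge tools may bucket layers differently.
    """
    pos = layer_idx / max(num_layers - 1, 1) * (len(gradient) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(gradient) - 1)
    frac = pos - lo
    return gradient[lo] * (1.0 - frac) + gradient[hi] * frac
```

Compared with plain linear averaging, SLERP follows the arc between the two weight vectors rather than the chord, which better preserves their norms when the models have diverged.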
Performance
I tested this model for a few days before publishing. It seems to mostly retain the instruction-following capabilities of Xwin-70B while adopting a lot of the creativity of the other two models. It handled my more complex scenarios, which creative models usually struggle with, quite well. At the same time, its outputs felt more creative and perhaps a bit more NSFW-inclined than Xwin-70B's. Subjectively, it feels better, but whether it's truly better remains to be tested.
Prompt Format
Vicuna
USER: [Prompt]
ASSISTANT:
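As an illustration, here is a minimal sketch of using this format with the fp16 weights via Hugging Face Transformers. The generation settings are assumptions, and a 70B model in fp16 needs on the order of 140 GB of memory across your devices.

```python
# Minimal sketch: Vicuna-style prompting with the fp16 weights via Transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "lizpreciatior/lzlv_70b_fp16_hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # shard across available GPUs (requires accelerate)
    torch_dtype="auto",
)

prompt = "USER: Write the opening scene of a mystery set in a rainy harbor town.\nASSISTANT:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.8)

# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```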
License
This project is licensed under the cc-by-nc-2.0 license.