🚀 AngelSlayer-12B-Unslop-Mell-RPMax-DARKNESS
They say ‘He’ will bring the apocalypse. She seeks understanding, not destruction.
This is a merged pre-trained language model created with mergekit. It is the author's fourth model and was made to test the della_linear merge method. The core idea behind this model is to leverage the negative characteristics of DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS to counter potential positivity bias while maintaining stability.
🚀 Quick Start
This model is the result of merging multiple pre-trained language models. For a quick start, see the Usage Examples and Parameters sections below for recommended settings and an overview of its behavior.
✨ Features
- Context Handling: The model handles context well, adhering closely to the provided character and prompt.
- Prose Quality: It generates expansive and varied prose that is largely free of GPT-isms.
- Error Predictability: Errors in the output are somewhat predictable; for example, if the model misspells a user's name on its first occurrence, it may correct it on later occurrences.
- Low Repetition: Repetition is relatively low; activate DRY if it appears (see the sampler sketch under Parameters below).
📦 Installation
No specific installation steps are provided in the original document.
💻 Usage Examples
No code examples are provided in the original document.
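As an illustration only (not from the original card), the sketch below shows how the model could be loaded and prompted with Hugging Face Transformers, using the sampler values recommended in the Parameters table. It assumes the `transformers`, `accelerate`, and `torch` packages are installed, that the repository id matches the model name (this is an assumption), and that your transformers release is recent enough to support `min_p`; the persona and prompt are placeholders.

```python
# Hedged example, not from the original card: load the merge and generate with the
# sampler values suggested in the Parameters table (temperature ~1.0-1.25, Min-P ~0.1-0.25).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "redrix/AngelSlayer-12B-Unslop-Mell-RPMax-DARKNESS"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the merge itself was produced in bfloat16
    device_map="auto",           # requires `accelerate`
)

messages = [
    {"role": "system", "content": "You are the narrator of a grim interactive story."},  # placeholder persona
    {"role": "user", "content": "Open the scene at the gates of a ruined cathedral."},   # placeholder prompt
]

# The card specifies ChatML; apply_chat_template uses the template bundled with the tokenizer.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids,
    max_new_tokens=512,
    do_sample=True,
    temperature=1.1,  # within the 1.0-1.25 range suggested by the card
    min_p=0.1,        # within the 0.1-0.25 range suggested by the card
)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```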
📚 Documentation
Testing Stage
(18/12/2024): The model performs well at handling context and sticking to the character/prompt. Its prose is expansive and varied, with few GPTisms. However, it tends to interpret inputs in a similar way, likely due to the self_attn layers, so outputs often follow a certain theme or direction even though the wording varies. Errors are predictable, and the model can sometimes correct itself. Repetition is low, and DRY can be enabled if needed. A higher temperature (1.25) seems to work better, and XTC can significantly improve the output without reducing intelligence.
EDIT: The similarity in output themes might stem from inflatebot/MN-12B-Mag-Mell-R1. The author plans to adjust the model weights or experiment with different merge methods using the base models of inflatebot/MN-12B-Mag-Mell-R1 to address this.
Parameters
| Property | Details |
|----------|---------|
| Context size | Not more than 20k is recommended, as coherency may degrade beyond that. |
| Chat Template | ChatML |
| Samplers | A Temperature-Last of 1-1.25 and a Min-P of 0.1-0.25 are viable but not fine-tuned. Activate DRY if repetition appears; XTC seems to work well. |
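DRY and XTC are not exposed by plain Hugging Face Transformers; they are implemented in backends such as the llama.cpp server, KoboldCpp, and text-generation-webui. The snippet below is a hedged sketch of a request to a locally running llama.cpp server assumed to be serving a GGUF quantization of this model. The parameter names and values follow late-2024 llama.cpp builds and are assumptions; check your backend's documentation before relying on them.

```python
# Hedged sketch: ChatML-formatted request to a local llama.cpp server using the card's
# sampler suggestions plus DRY and XTC. All values are illustrative, not tuned.
import requests

prompt = (
    "<|im_start|>user\n"
    "Describe the ruined cathedral at dusk.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

payload = {
    "prompt": prompt,
    "n_predict": 400,        # max tokens to generate
    "temperature": 1.1,      # card suggests 1.0-1.25
    "min_p": 0.1,            # card suggests 0.1-0.25
    "dry_multiplier": 0.8,   # enable DRY only if repetition shows up (0 disables it)
    "xtc_probability": 0.5,  # XTC reportedly improves output without reducing intelligence
    "xtc_threshold": 0.1,
}

response = requests.post("http://localhost:8080/completion", json=payload, timeout=300)
print(response.json()["content"])
```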
Quantization
Merge Details
Merge Method
This model was merged using the della_linear merge method, with TheDrummer/UnslopNemo-12B-v4.1 as the base model.
Models Merged
The following models were included in the merge:
- ArliAI/Mistral-Nemo-12B-ArliAI-RPMax-v1.2
- DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS
- inflatebot/MN-12B-Mag-Mell-R1
Configuration
The following YAML configuration was used to produce this model:
```yaml
models:
  - model: TheDrummer/UnslopNemo-12B-v4.1
    parameters:
      weight: 0.25
      density: 0.6
  - model: ArliAI/Mistral-Nemo-12B-ArliAI-RPMax-v1.2
    parameters:
      weight: 0.25
      density: 0.6
  - model: DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS
    parameters:
      weight: 0.2
      density: 0.4
  - model: inflatebot/MN-12B-Mag-Mell-R1
    parameters:
      weight: 0.30
      density: 0.7
base_model: TheDrummer/UnslopNemo-12B-v4.1
merge_method: della_linear
dtype: bfloat16
chat_template: "chatml"
tokenizer_source: union
parameters:
  normalize: false
  int8_mask: true
  epsilon: 0.05
  lambda: 1
```
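For reference, a merge like this can be reproduced locally with mergekit. The sketch below is not from the original card: it assumes mergekit is installed (`pip install mergekit`), that there is enough disk space for the four source models, and uses an arbitrary file name for the saved config; the `mergekit-yaml` entry point and `--cuda` flag follow mergekit's documentation at the time of writing.

```python
# Hedged sketch: re-run the della_linear merge from the YAML above via mergekit's CLI.
import subprocess

CONFIG_PATH = "angelslayer_della_linear.yaml"  # arbitrary name; save the YAML above here
OUTPUT_DIR = "./AngelSlayer-12B-merge"         # where the merged weights will be written

# mergekit installs a `mergekit-yaml` console script that takes <config> <output_dir>.
subprocess.run(
    ["mergekit-yaml", CONFIG_PATH, OUTPUT_DIR, "--cuda"],  # drop --cuda on CPU-only machines
    check=True,
)
```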
🔧 Technical Details
The model uses the della_linear merge method. The order of models in the DELLA-Linear configuration matters: models listed lower in the config carry more prevalence in the result. The recommended context size, chat template, and samplers (see Parameters above) are also important for getting the intended performance.
📄 License
The model is released under the apache-2.0 license.
Today we hustle, 'day we hustle but tonight we play.