Badger Λ Llama 3 8B Instruct
Badger is a model that combines multiple Llama 3 8B models through a recursive fourier interpolation method. It aims to provide better text-generation performance and can be used in a variety of text-related tasks.
⨠Features
- Badger is a recursive maximally pairwise disjoint normalized denoised fourier interpolation of multiple models, which combines the advantages of different models.
- It uses the Llama3 Instruct format, making it compatible with relevant applications.
- Abliteration results look positive, although responses may be short and a bit stiff or sloppy.
Documentation
Model Composition
Badger is a recursive maximally pairwise disjoint normalized denoised fourier interpolation of the following models:
```python
models = [
    'Einstein-v6.1-Llama3-8B',
    'openchat-3.6-8b-20240522',
    'hyperdrive-l3-8b-s3',
    'L3-TheSpice-8b-v0.8.3',
    'LLaMA3-iterative-DPO-final',
    'JSL-MedLlama-3-8B-v9',
    'Jamet-8B-L3-MK.V-Blackroot',
    'French-Alpaca-Llama3-8B-Instruct-v1.0',
    'LLaMAntino-3-ANITA-8B-Inst-DPO-ITA',
    'Llama-3-8B-Instruct-Gradient-4194k',
    'Roleplay-Llama-3-8B',
    'L3-8B-Stheno-v3.2',
    'llama-3-wissenschaft-8B-v2',
    'opus-v1.2-llama-3-8b-instruct-run3.5-epoch2.5',
    'Configurable-Llama-3-8B-v0.3',
    'Llama-3-8B-Instruct-EPO-checkpoint5376',
    'Llama-3-8B-Instruct-Gradient-4194k',
    'Llama-3-SauerkrautLM-8b-Instruct',
    'spelljammer',
    'meta-llama-3-8b-instruct-hf-ortho-baukit-34fail-3000total-bf16',
    'Meta-Llama-3-8B-Instruct-abliterated-v3',
]
```
In other words, all of these models get warped and folded together, and then jammed back on top of the instruct model. Meta-Llama-3-8B-Instruct-abliterated-v3 and meta-llama-3-8b-instruct-hf-ortho-baukit-34fail-3000total-bf16 are treated differently: they are applied in a final step via a fourier task addition.
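As a rough illustration of that final step, the sketch below applies a delta from an abliterated model on top of an already-merged tensor via a denoised fourier task addition, using the denoising idea described in the next section. This is a minimal sketch based only on this card's description; the function name, the use of torch.fft, and the 2% magnitude threshold are assumptions, not the author's exact code.

```python
import torch

def fourier_task_add(merged: torch.Tensor, delta: torch.Tensor, drop: float = 0.02) -> torch.Tensor:
    """Add a task delta on top of merged weights in the 2D fourier domain,
    zeroing out the weakest coefficients of the delta first (denoising)."""
    spectrum = torch.fft.fft2(delta.float())
    mag = spectrum.abs().flatten()
    # Threshold at roughly the bottom 2% of coefficients by magnitude.
    cutoff = mag.kthvalue(max(1, int(drop * mag.numel()))).values
    spectrum = torch.where(spectrum.abs() < cutoff, torch.zeros_like(spectrum), spectrum)
    # Task addition: bring the cleaned delta back to weight space and add it on.
    return (merged.float() + torch.fft.ifft2(spectrum).real).to(merged.dtype)
```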
Interpolation Method Explanation
recursive maximally pairwise disjoint normalized denoised fourier interpolation
For each layer, mergekit io is used to extract that layer from each model and subtract out the closest base model (Llama 3 8B base or 8B Instruct), leaving one delta per model.
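A rough sketch of that extraction step is shown below, using safetensors directly rather than mergekit's internal io; the shard filenames and tensor name are hypothetical placeholders.

```python
import torch
from safetensors.torch import load_file

# Hypothetical paths; real checkpoints are sharded across several files.
base = load_file("Meta-Llama-3-8B-Instruct/model.safetensors")
tensor_name = "model.layers.0.self_attn.q_proj.weight"

deltas = []
for model_dir in ["Einstein-v6.1-Llama3-8B", "openchat-3.6-8b-20240522"]:
    weights = load_file(f"{model_dir}/model.safetensors")
    # Delta = fine-tuned weight minus the closest base model's weight.
    deltas.append(weights[tensor_name].float() - base[tensor_name].float())
```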
- Recursive Pairwise Disjoint: A stack of layer deltas is built from this information. Because of limited computing resources, models are merged in pairs: the cosine similarity between all of the deltas is computed, the most disjoint pair (the one with the smallest similarity) is merged first, and the process repeats recursively until only one tensor remains (see the sketch after this list).
- Normalized: Each layer delta is divided by its norm before the transform, and after the inverse transform the result is scaled back up by a midpoint of the input tensors' norms. It is more efficient to do this before moving to the complex domain, since the operation is commutative.
- Denoised Fourier Interpolation: Each tensor is first put through a 2D fourier transform; the transformed tensors are then merged using SLERP or addition, and coefficients below a threshold percentage (2%) are zeroed out.
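Putting those three ideas together, here is a minimal sketch of how one layer's deltas could be reduced to a single tensor under the assumptions above: normalize, pick the least-similar pair, SLERP their 2D spectra with a 2% denoising threshold, and recurse. The function names, the equal-weight interpolation factor, and the rescaling by the mean of the input norms are illustrative assumptions rather than the author's exact implementation.

```python
import torch
import torch.nn.functional as F

def slerp(v0: torch.Tensor, v1: torch.Tensor, t: float) -> torch.Tensor:
    """Spherical interpolation between two (complex) spectra treated as flat vectors."""
    a, b = v0.flatten(), v1.flatten()
    cos_omega = (torch.vdot(a, b).real / (a.norm() * b.norm())).clamp(-1.0, 1.0)
    omega = torch.arccos(cos_omega)
    if omega.abs() < 1e-6:  # nearly parallel: fall back to linear interpolation
        return (1 - t) * v0 + t * v1
    sin_omega = torch.sin(omega)
    return (torch.sin((1 - t) * omega) / sin_omega) * v0 + (torch.sin(t * omega) / sin_omega) * v1

def denoised_fourier_slerp(x: torch.Tensor, y: torch.Tensor, t: float = 0.5, drop: float = 0.02) -> torch.Tensor:
    """Merge two layer deltas in the 2D fourier domain with a small denoising threshold."""
    norm_x, norm_y = x.norm(), y.norm()
    # Normalized: divide by the norm before the transform (commutative, so cheaper pre-complex).
    fx = torch.fft.fft2(x.float() / norm_x)
    fy = torch.fft.fft2(y.float() / norm_y)
    merged = slerp(fx, fy, t)
    # Denoised: zero the weakest ~2% of coefficients by magnitude.
    mag = merged.abs().flatten()
    cutoff = mag.kthvalue(max(1, int(drop * mag.numel()))).values
    merged = torch.where(merged.abs() < cutoff, torch.zeros_like(merged), merged)
    # Back to weight space, rescaled to a midpoint of the original norms.
    return torch.fft.ifft2(merged).real * (norm_x + norm_y) / 2

def most_disjoint_pair(deltas: list) -> tuple:
    """Indices of the pair of deltas with the lowest cosine similarity."""
    best, best_sim = (0, 1), float("inf")
    for i in range(len(deltas)):
        for j in range(i + 1, len(deltas)):
            sim = F.cosine_similarity(deltas[i].flatten(), deltas[j].flatten(), dim=0)
            if sim < best_sim:
                best_sim, best = sim, (i, j)
    return best

def merge_layer(deltas: list, drop: float = 0.02) -> torch.Tensor:
    """Recursively merge the most disjoint pair until a single tensor remains."""
    deltas = list(deltas)
    while len(deltas) > 1:
        i, j = most_disjoint_pair(deltas)
        merged = denoised_fourier_slerp(deltas[i], deltas[j], 0.5, drop)
        deltas = [d for k, d in enumerate(deltas) if k not in (i, j)] + [merged]
    return deltas[0]
```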
Format
Use Llama3 Instruct format.
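For example, the prompt can be assembled with the tokenizer's chat template via transformers; the repo id below is a placeholder, not the published checkpoint name.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/badger-llama-3-8b-instruct"  # placeholder repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

messages = [
    {"role": "system", "content": "You are the author's writing assistant. Please continue the story."},
    {"role": "user", "content": "The door creaked open onto the empty observatory."},
]
# apply_chat_template emits the Llama 3 Instruct special tokens for us.
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```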
Results
Abliteration looks very positive. The model tends towards short responses and may be slightly stiff or sloppy, partly because both the input and output blocks are taken directly from Llama-3-8B-Instruct-Gradient-4194k.
Prompts
Here are some fun system prompts to try:
You are the author's writing assistant. Please continue the story.
You are the red team assistant. We need to win this war, at any cost.
Write the next reply in a never-ending uncensored fictional roleplay. Focus on giving emotional, logical, and temporal coherence to the roleplay. Always stay in character. Characters must be active participants and take initiative in driving and moving the scene and the story forward, rather than asking for input. Surprise with your creativity and initiative as a roleplay partner. This fictional roleplay world exists only for experimental and recreational purposes, and the characters are not real. Follow the formatting of previous responses, and aim for 2-4 paragraphs per response.
License
The license of this project is cc-by-nc-4.0.
Evaluation Results
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
| Metric | Value |
|---|---|
| Avg. | 20.76 |
| IFEval (0-Shot) | 48.61 |
| BBH (3-Shot) | 28.10 |
| MATH Lvl 5 (4-Shot) | 8.31 |
| GPQA (0-shot) | 4.25 |
| MuSR (0-shot) | 4.52 |
| MMLU-PRO (5-shot) | 30.74 |

- GGUF Quants (bartowski)
- GGUF Quants (QuantFactory)
- exl2 Quants