BoreanGale-70B
BoreanGale-70B is a merged model built with a custom algorithm (NearSwap) that combines the strengths of 152334H/miqu-1-70b-sf and Sao10K/WinterGoddess-1.4x-70B-L2. It shows promising performance across a range of text-generation tasks.

Quick Start
This section provides an overview of the BoreanGale-70B model, including its composition, available quantizations, algorithm details, license, and evaluation results.
Features
- Custom Merge Algorithm: Utilizes the NearSwap algorithm to combine two base models effectively.
- Multiple Quantizations: Thanks to community efforts, several quantization types are available.
- Good Performance: Performs well across multiple text-generation tasks on the Open LLM Leaderboard.
Installation
No installation steps are provided in the original model card.
Usage Examples
The original model card does not include code examples.
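A minimal sketch of loading the model with the Hugging Face transformers library is shown below; the repository id (alchemonaut/BoreanGale-70B), the prompt, and the generation settings are illustrative assumptions, not taken from the original card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "alchemonaut/BoreanGale-70B"  # assumed Hub location of this merge

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # shard the 70B weights across available GPUs
    torch_dtype="auto",
)

prompt = "Write a short poem about the northern lights."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```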
Documentation
Model Composition
BoreanGale-70B is a merge, using a custom algorithm (NearSwap), of:
- 152334H/miqu-1-70b-sf
- Sao10K/WinterGoddess-1.4x-70B-L2
Available Quants
Several quants are available thanks to community efforts:
| Type | Misc | Author |
|------|------|--------|
| GGUF | iMat Q3 | Nexesenex |
| GGUF | iMat | mradermacher |
| GGUF | Full Set | mradermacher |
| GGUF | Misc | LoneStriker |
| exl2 | 2.4 bpw | LoneStriker |
| exl2 | 3.5 bpw | LoneStriker |
| exl2 | 4.0 bpw | LoneStriker |
| exl2 | 4.65 bpw | LoneStriker |
NearSwap Algorithm
NearSwap retains most of the weights of the base model (Miqu), but when a weight is similar between the two, it is interpolated to the secondary model (WinterGoddess) value. A parameter t specifies the sameness threshold. When the distance between two values is below t, the weight from the secondary model (WinterGoddess) is used.
This version of the model uses t = 0.001. At this t, about 10% of weights are fully switched to WinterGoddess. Model quality rapidly degrades above t = 0.0025:
- t = 0.0001 (~0.8% full swap): QuartetAnemoi-70B-t0.0001
- t = 0.0003 (~2% full swap)
- t = 0.001 (~10% full swap): This model
- t = 0.0025 (~18% full swap): Generates one paragraph okay, but then reverts to garbage
- t = 0.005 (~35% full swap): Garbage; semi-related word lists
- t = 0.01 (~55% full swap): Garbage; pseudorandom tokens output
NearSwap implementation (reconstructed as a runnable sketch; the `nearswap` wrapper, the `lerp` helper, and the imports are assumed, since the original snippet shows only the signature fragment and body):

```python
from typing import Union
import numpy as np
import torch

def lerp(t, v0, v1):  # assumed helper: standard linear interpolation
    return (1.0 - t) * v0 + t * v1

def nearswap(
    t: Union[float, np.ndarray],
    v0: Union[np.ndarray, torch.Tensor],
    v1: Union[np.ndarray, torch.Tensor],
):
    lweight = np.absolute(v0 - v1)   # distance between paired weights
    lweight = t / lweight            # > 1 wherever |v0 - v1| < t
    lweight = np.nan_to_num(lweight, nan=1.0, posinf=1.0, neginf=1.0)
    np.clip(lweight, a_min=0.0, a_max=1.0, out=lweight)  # cap the weight at 1
    return lerp(lweight, v0, v1)     # equals v1 (secondary model) where weights are near-identical
```
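In a full merge, this function would be applied tensor-by-tensor to corresponding weights of the two source models. The loop below is only a sketch under that assumption; the `nearswap_state_dicts` name and the premise that both checkpoints are available as PyTorch state dicts with matching keys are illustrative, not from the original card:

```python
import torch

def nearswap_state_dicts(miqu_sd, wintergoddess_sd, t=0.001):
    # Assumption: both state dicts (e.g. from model.state_dict()) share keys and shapes.
    merged = {}
    for name, w0 in miqu_sd.items():
        w1 = wintergoddess_sd[name]
        out = nearswap(t, w0.float().cpu().numpy(), w1.float().cpu().numpy())
        merged[name] = torch.from_numpy(out).to(w0.dtype)
    return merged
```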
License and Use
Since the ultimate origin of Miqu is at this time unknown beyond speculation, this model is for noncommercial research use only.
Open LLM Leaderboard Evaluation Results
Detailed results can be found here.

| Metric | Value |
|--------|-------|
| Avg. | 76.48 |
| AI2 Reasoning Challenge (25-Shot) | 73.89 |
| HellaSwag (10-Shot) | 89.37 |
| MMLU (5-Shot) | 75.19 |
| TruthfulQA (0-shot) | 68.6 |
| Winogrande (5-shot) | 84.53 |
| GSM8k (5-shot) | 67.32 |
Technical Details
The NearSwap algorithm is a key technical aspect of this model. It carefully controls the weight interpolation between two models based on the similarity of weights. By adjusting the parameter t, the proportion of weights switched from the base model to the secondary model can be controlled. However, the model quality is highly sensitive to the value of t, and performance degrades rapidly when t exceeds 0.0025.
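As a rough way to see how t controls the swapped proportion, the sketch below counts the share of weight entries whose pairwise distance falls below each candidate t. The synthetic tensors are stand-ins for corresponding weights of the two source models, so the printed percentages will not match the figures quoted above:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-ins for corresponding weight tensors from the two source models.
v0 = rng.normal(size=1_000_000).astype(np.float32)
v1 = v0 + rng.normal(scale=0.01, size=v0.shape).astype(np.float32)

for t in (0.0001, 0.0003, 0.001, 0.0025, 0.005, 0.01):
    full_swap = np.mean(np.abs(v0 - v1) < t)  # fraction of entries fully switched to v1
    print(f"t={t}: fully swapped {full_swap:.1%}")
```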
License
This model is for noncommercial research use only due to the uncertain origin of Miqu.