Supermario-slerp-v2 Open-source Text Generation Model - Merged by SLERP, Outstanding Performance in Multiple Benchmark Tests

Supermario Slerp V2

Developed by jan-hq

supermario-slerp-v2 is a text generation model created by merging two 7B-parameter models using the SLERP method, demonstrating outstanding performance across multiple benchmarks.

Large Language Model

Transformers

EnglishOpen Source License:Apache-2.0 #Multitask Text Generation #High-Accuracy Reasoning #Knowledge-Intensive QA

Downloads 15

Release Time : 12/12/2023

Model Overview

This model is created by merging the v1olet_marcoroni-go-bruins-merge-7B and juanako-7b-UNA models via the SLERP method, primarily designed for text generation tasks.

Model Features

Model Merging Technique

Uses the SLERP method to merge two distinct models, combining their respective strengths

High Performance

Excels in multiple benchmarks, such as achieving a normalized accuracy of 86.6 on HellaSwag

Open-Source Availability

Released under Apache 2.0 license, freely usable and modifiable

Model Capabilities

Text Generation

Question Answering

Reasoning Tasks

Use Cases

Education

AI2 Reasoning Challenge

Used to solve complex reasoning problems

Normalized accuracy 69.37

Commonsense Reasoning

HellaSwag Test

Evaluates the model's commonsense reasoning ability

Normalized accuracy 86.6

Mathematical Problem Solving

GSM8k Math Test

Solves elementary math problems

Accuracy 63.46

🚀 supermario-slerp-v2

supermario-slerp-v2 is a merged model that uses the Slerp method. It can be run on Jan Desktop, offering an open - source, offline, and OpenAI - compatible alternative for text generation tasks.

🚀 Quick Start

You can run this model using Jan Desktop on Mac, Windows, or Linux.

Jan is an open source, ChatGPT alternative with the following features:

💻 100% offline on your machine: Your conversations remain confidential, and visible only to you.
🗂️ An Open File Format: Conversations and model settings stay on your computer and can be exported or deleted at any time.
🌐 OpenAI Compatible: Local server on port 1337 with OpenAI compatible endpoints
🌍 Open Source & Free: We build in public; check out our Github

image/png

✨ Features

Model Merging: This model uses the Slerp merge method from 2 models:
1. [v1olet_marcoroni - go - bruins - merge - 7B](https://huggingface.co/v1olet/v1olet_marcoroni - go - bruins - merge - 7B)
2. [juanako - 7b - UNA](https://huggingface.co/fblgit/juanako - 7b - UNA)
Base Model: [v1olet_marcoroni - go - bruins - merge - 7B](https://huggingface.co/v1olet/v1olet_marcoroni - go - bruins - merge - 7B)

📚 Documentation

Model Description

This model uses the Slerp merge method from 2 models:

[v1olet_marcoroni - go - bruins - merge - 7B](https://huggingface.co/v1olet/v1olet_marcoroni - go - bruins - merge - 7B)
[juanako - 7b - UNA](https://huggingface.co/fblgit/juanako - 7b - UNA)

base model: [v1olet_marcoroni - go - bruins - merge - 7B](https://huggingface.co/v1olet/v1olet_marcoroni - go - bruins - merge - 7B)

The yaml config file for this model is here:

slices:
  - sources:
      - model: v1olet/v1olet_marcoroni - go - bruins - merge - 7B
        layer_range: [0, 32]
      - model: fblgit/juanako - 7b - UNA
        layer_range: [0, 32]
merge_method: slerp
base_model: v1olet/v1olet_marcoroni - go - bruins - merge - 7B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16

About Jan

Jan believes in the need for an open - source AI ecosystem and is building the infra and tooling to allow open - source AIs to compete on a level playing field with proprietary ones.

Jan's long - term vision is to build a cognitive framework for future robots, who are practical, useful assistants for humans and businesses in everyday life.

Jan Model Merger

This is a test project for merging models.

Open LLM Leaderboard Evaluation Results

Detailed results can be found [here](https://huggingface.co/datasets/open - llm - leaderboard/details_janhq__supermario - slerp - v2)

Metric	Value
Avg.	71.35
AI2 Reasoning Challenge (25 - Shot)	69.37
HellaSwag (10 - Shot)	86.60
MMLU (5 - Shot)	64.91
TruthfulQA (0 - shot)	62.96
Winogrande (5 - shot)	80.82
GSM8k (5 - shot)	63.46

📄 License

This model is licensed under the apache - 2.0 license.

Acknowlegement

mergekit
[DARE](https://github.com/yule - BUAA/MergeLM/blob/main/README.md)
[SLERP](https://github.com/Digitous/LLM - SLERP - Merge)
[lm - evaluation - harness](https://github.com/EleutherAI/lm - evaluation - harness)

Featured Recommended AI Models

Empowering the Future, Your AI Solution Knowledge Base

English 简体中文繁體中文にほんご