# Sonya-7B
Sonya-7B is currently the #1 model on the first turn of MT-Bench, outperforming GPT-4, and ranks #2 overall on MT-Bench. It is a versatile model suitable for a range of tasks, such as assistant use and role-playing.
## Quick Start
Based on its parent models, this model is expected to be used with an 8192-token context window. You can experimentally extend this to a 16384-token context by using NTK scaling with an alpha of 2.6.
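As a rough guide to what that alpha does, NTK-aware scaling stretches the RoPE base frequency. The sketch below uses the commonly cited community formula and assumes a Mistral-7B-style head dimension of 128 and a base of 10000; these values and the function name are assumptions, not part of this model card.

```python
def ntk_scaled_rope_base(base: float = 10000.0, alpha: float = 2.6,
                         head_dim: int = 128) -> float:
    """Adjusted RoPE base under NTK-aware scaling: base * alpha^(d / (d - 2))."""
    return base * alpha ** (head_dim / (head_dim - 2))

# alpha = 1.0 leaves the base unchanged; alpha = 2.6 roughly
# stretches positional resolution enough for a ~2x longer context.
```

Inference backends typically expose this as a single knob (e.g. an NTK/RoPE alpha setting) rather than asking for the adjusted base directly.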
## Features

### Outstanding Performance

Sonya-7B significantly outperforms its parent models on MT-Bench. On the MT-Bench average across turns, it scores 8.52, trailing only GPT-4.
| model | score | size |
|---|---|---|
| gpt-4 | 8.99 | - |
| Sonya-7B | 8.52 | 7b |
| xDAN-L1-Chat-RL-v1 | 8.34 | 7b |
| Starling-7B | 8.09 | 7b |
| Claude-2 | 8.06 | - |
| Silicon-Maid | 7.96 | 7b |
| Loyal-Macaroni-Maid | 7.95 | 7b |
| gpt-3.5-turbo | 7.94 | 20b? |
| Claude-1 | 7.90 | - |
| OpenChat-3.5 | 7.81 | - |
| vicuna-33b-v1.3 | 7.12 | 33b |
| wizardlm-30b | 7.01 | 30b |
| Llama-2-70b-chat | 6.86 | 70b |
### Model Composition
It's a merge of [xDAN-AI/xDAN-L1-Chat-RL-v1](https://huggingface.co/xDAN-AI/xDAN-L1-Chat-RL-v1), [Jan-Ai's Stealth v1.2](https://huggingface.co/jan-hq/stealth-v1.2), [chargoddard/piano-medley-7b](https://huggingface.co/chargoddard/piano-medley-7b), [NeverSleep/Noromaid-7B-v0.2](https://huggingface.co/NeverSleep/Noromaid-7b-v0.2), and [athirdpath/NSFW_DPO_vmgb-7b](https://huggingface.co/athirdpath/NSFW_DPO_vmgb-7b).
### Selection Rationale

- MT-Bench Correlation: MT-Bench usually correlates well with real-world model quality, and xDAN performs well on it.
- Prompt Consistency: Most models in the mix use Alpaca prompt formatting, ensuring prompt consistency.
- Magic Ingredient: Stealth v1.2 seems to boost MT-Bench scores.
- RP Enhancement: Adding RP models improves performance on the Writing and Role-play benchmarks.
### Other Benchmark Results

#### First turn
| model | turn | score | size |
|---|---|---|---|
| Sonya-7B | 1 | 9.06875 | 7b |
| gpt-4 | 1 | 8.95625 | - |
| xDAN-L1-Chat-RL-v1 | 1 | 8.87500 | 7b |
| xDAN-L2-Chat-RL-v2 | 1 | 8.78750 | 30b |
| claude-v1 | 1 | 8.15000 | - |
| gpt-3.5-turbo | 1 | 8.07500 | 20b |
| vicuna-33b-v1.3 | 1 | 7.45625 | 33b |
| wizardlm-30b | 1 | 7.13125 | 30b |
| oasst-sft-7-llama-30b | 1 | 7.10625 | 30b |
| Llama-2-70b-chat | 1 | 6.98750 | 70b |
#### Second turn
| model | turn | score | size |
|---|---|---|---|
| gpt-4 | 2 | 9.025000 | - |
| xDAN-L2-Chat-RL-v2 | 2 | 8.087500 | 30b |
| Sonya-7B | 2 | 7.962500 | 7b |
| xDAN-L1-Chat-RL-v1 | 2 | 7.825000 | 7b |
| gpt-3.5-turbo | 2 | 7.812500 | 20b |
| claude-v1 | 2 | 7.650000 | - |
| wizardlm-30b | 2 | 6.887500 | 30b |
| vicuna-33b-v1.3 | 2 | 6.787500 | 33b |
| Llama-2-70b-chat | 2 | 6.725000 | 70b |
## Documentation

### The Sauce
```yaml
models:
  - model: xDAN-AI/xDAN-L1-Chat-RL-v1
    parameters:
      weight: 1
      density: 1
  - model: chargoddard/piano-medley-7b
    parameters:
      weight: 0.3
  - model: jan-hq/stealth-v1.2
    parameters:
      weight: 0.2
  - model: NeverSleep/Noromaid-7b-v0.2
    parameters:
      weight: 0.2
  - model: athirdpath/NSFW_DPO_vmgb-7b
    parameters:
      weight: 0.2
merge_method: ties
base_model: mistralai/Mistral-7B-v0.1
parameters:
  density: 0.4
  int8_mask: true
  normalize: true
dtype: bfloat16
```
Note: There was no additional training, finetuning, or DPO. This is a straight merge.
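For intuition, the TIES method named in the config above works in three steps: trim each model's task vector to its largest-magnitude entries (controlled by `density`), elect a per-parameter sign by weighted majority, then average only the contributions that agree with that sign. The sketch below is a minimal NumPy illustration of those three steps under those assumptions; it is not mergekit's actual implementation, which operates on full model tensors with additional normalization options.

```python
import numpy as np

def ties_merge(base, deltas, weights, density=0.4):
    """Toy TIES merge of task vectors (`deltas`) onto a `base` parameter array."""
    # Step 1 (trim): keep only the top-`density` fraction of each delta by magnitude.
    trimmed = []
    for d in deltas:
        k = max(1, int(density * d.size))
        thresh = np.sort(np.abs(d).ravel())[-k]
        trimmed.append(np.where(np.abs(d) >= thresh, d, 0.0))
    # Step 2 (elect sign): per-parameter sign of the weighted sum of trimmed deltas.
    stacked = np.stack([w * t for w, t in zip(weights, trimmed)])
    sign = np.sign(stacked.sum(axis=0))
    # Step 3 (disjoint merge): average only contributions agreeing with the elected sign.
    agree = (np.sign(stacked) == sign) & (stacked != 0)
    num = (stacked * agree).sum(axis=0)
    den = np.maximum(agree.sum(axis=0), 1)
    return base + num / den
```

The trimming step is why a low `density` (0.4 here) can help: it discards small, noisy parameter changes so that conflicting models interfere less.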
### Prompt Template (Alpaca)

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{prompt}

### Response:
```
Testing found that this model performs worse with the xDAN prompt format, so despite xDAN's heavy weight in the merge, using that format is not recommended.
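If you are wiring the template up yourself, a minimal helper like the following (the function name is my own, not part of this card) fills the `{prompt}` slot:

```python
def format_alpaca(prompt: str) -> str:
    """Wrap a user prompt in the Alpaca template recommended above."""
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n"
        f"{prompt}\n\n"
        "### Response:\n"
    )
```

Generation should then continue from the trailing `### Response:` line.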
### Replicating the MT-Bench Run

If you want to replicate the MT-Bench run, make sure to apply the Alpaca prompt template to the model. You can do this by putting "alpaca" in the model path to trigger the AlpacaAdapter.
## License

This project is licensed under the CC-BY-4.0 license.