Daredevil-8B: Open-Source Mega-Merge Model with the Highest MMLU Score Among Llama 3 8B Models (as of May 27, 2024)

Daredevil 8B

Developed by mlabonne

Daredevil-8B is a mega-merge model designed to maximize MMLU scores. As of May 27, 2024, it has the highest MMLU score of any Llama 3 8B model.

Large Language Model

Transformers

License: Other. Tags: high MMLU score, multi-model merge, knowledge reasoning.

Release date: May 25, 2024

Model Overview

Daredevil-8B is a merged model based on the Llama 3 8B architecture. It combines nine Llama 3 8B fine-tunes using the dare_ties merge method, with the goal of maximizing MMLU performance. It can be used as an enhanced alternative to Meta-Llama-3-8B-Instruct.

Model Features

High-performance MMLU score

Highest MMLU score among Llama 3 8B models as of May 27, 2024 (69.24)

Multi-model merge

Merges nine Llama 3 8B fine-tuned variants on top of the NousResearch/Meta-Llama-3-8B base model

Censored model

This is the censored version, suited to safety-sensitive applications; an uncensored variant is available as mlabonne/Daredevil-8B-abliterated

Model Capabilities

Text generation

Question answering

Knowledge reasoning

Dialogue systems

Use Cases

Education

Knowledge Q&A

Used for knowledge Q&A systems in the education field

Highest MMLU score among Llama 3 8B models on the Open LLM Leaderboard as of May 27, 2024

Research

Benchmarking

Used for language model performance research and benchmarking

Highest average (55.87) among the 8B models compared on the Nous benchmark suite (AGIEval, GPT4All, TruthfulQA, Bigbench) as of May 27, 2024

🚀 Daredevil-8B

Daredevil-8B is a mega-merge model built to maximize MMLU performance. As of May 27, 2024, it holds the highest MMLU score among Llama 3 8B models. According to the author, a high MMLU score is in practice a significant advantage for Llama 3 models.

🚀 Quick Start

Daredevil-8B is a merge of multiple models using LazyMergekit. You can use it as an enhanced alternative to meta-llama/Meta-Llama-3-8B-Instruct.

✨ Features

High MMLU Score: As of May 27, 2024, it has the highest MMLU score among Llama 3 8B models on the Open LLM Leaderboard.
Versatile Applications: Can be used as an improved alternative to meta-llama/Meta-Llama-3-8B-Instruct.
Censored and Uncensored Versions: This model is the censored version. An uncensored version is available at mlabonne/Daredevil-8B-abliterated.

📦 Installation

The model can be used directly through the Hugging Face platform. To run it locally, install the required libraries with the following command. The leading "!" is for notebook cells (for example Colab or Jupyter); omit it in a regular shell.

!pip install -qU transformers accelerate

💻 Usage Examples

Basic Usage

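from transformers import AutoTokenizer
import transformers
import torch

# Hugging Face Hub repository ID of the model
model = "mlabonne/Daredevil-8B"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Render the conversation into a prompt string using the model's chat template
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Build a text-generation pipeline with bfloat16 weights;
# device_map="auto" places the model on the available devices (requires accelerate)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Sample up to 256 new tokens; by default the returned text contains the prompt followed by the completion
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])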
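
The sampling arguments in the call above (temperature, top_k, top_p) shape the distribution the next token is drawn from. The following is a simplified, standard-library sketch of that filtering on a toy list of logits. It is not the transformers implementation, and the function names filter_distribution and sample are hypothetical, introduced only for illustration.

```python
import math
import random


def filter_distribution(logits, temperature=0.7, top_k=50, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k and top-p filtering."""
    # Temperature: divide logits before the softmax; values below 1 sharpen the distribution.
    scaled = [x / temperature for x in logits]
    # Top-k: keep only the k highest-scoring tokens, in descending order.
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    # Softmax over the surviving tokens (subtract the maximum for numerical stability).
    peak = scaled[order[0]]
    weights = [math.exp(scaled[i] - peak) for i in order]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i, p in zip(order, probs):
        kept.append((i, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalize so the kept probabilities sum to 1.
    norm = sum(p for _, p in kept)
    return {i: p / norm for i, p in kept}


def sample(logits, rng=random, **kwargs):
    """Draw one token index from the filtered distribution."""
    dist = filter_distribution(logits, **kwargs)
    tokens, probs = zip(*dist.items())
    return rng.choices(tokens, weights=probs, k=1)[0]


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.0, -5.0]  # made-up values for a 4-token vocabulary
    print(filter_distribution(toy_logits))
    print(sample(toy_logits))
```

Lower temperature or smaller top_k and top_p values make the output more deterministic; larger values allow more varied completions.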

📚 Documentation

Applications

You can use it as an improved version of meta-llama/Meta-Llama-3-8B-Instruct. This is a censored model. For an uncensored version, see mlabonne/Daredevil-8B-abliterated. It has been tested on LM Studio using the "Llama 3" preset.
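
The "Llama 3" preset refers to the Llama 3 Instruct prompt layout. As a rough illustration, the sketch below renders a message list in that layout with plain Python. It assumes the model's tokenizer ships the standard Llama 3 Instruct chat template, which the source does not state explicitly; format_llama3_prompt is a hypothetical helper, and the authoritative prompt is whatever tokenizer.apply_chat_template returns.

```python
def format_llama3_prompt(messages, add_generation_prompt=True):
    """Render chat messages in the Llama 3 Instruct prompt layout (assumed template)."""
    parts = ["<|begin_of_text|>"]
    for message in messages:
        # Each turn is a role header, a blank line, the content, and an end-of-turn token.
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    if add_generation_prompt:
        # An open assistant header cues the model to write the reply.
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


if __name__ == "__main__":
    chat = [{"role": "user", "content": "What is a large language model?"}]
    print(format_llama3_prompt(chat))
```

When configuring another frontend by hand, matching this layout (including the end-of-turn token as a stop string) is usually what a "Llama 3" preset does for you.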

Quantization

GGUF: https://huggingface.co/mlabonne/Daredevil-8B-GGUF

Evaluation

Open LLM Leaderboard

Daredevil-8B is the best-performing 8B model on the Open LLM Leaderboard in terms of MMLU score (May 27, 2024).

Nous

Daredevil-8B is the best-performing 8B model on Nous' benchmark suite, with the highest average among the models listed below (evaluation performed using LLM AutoEval, May 27, 2024).

Model	Average	AGIEval	GPT4All	TruthfulQA	Bigbench
mlabonne/Daredevil-8B	55.87	44.13	73.52	59.05	46.77
mlabonne/Daredevil-8B-abliterated	55.06	43.29	73.33	57.47	46.17
mlabonne/Llama-3-8B-Instruct-abliterated-dpomix	52.26	41.6	69.95	54.22	43.26
meta-llama/Meta-Llama-3-8B-Instruct	51.34	41.22	69.86	51.65	42.64
failspy/Meta-Llama-3-8B-Instruct-abliterated-v3	51.21	40.23	69.5	52.44	42.69
mlabonne/OrpoLlama-3-8B	48.63	34.17	70.59	52.39	37.36
meta-llama/Meta-Llama-3-8B	45.42	31.1	69.95	43.91	36.7
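
The Average column appears to be the unweighted mean of the four benchmark columns. That is an inference from the numbers, not something the source states. The short check below recomputes it from the scores in the table; the helper names are hypothetical.

```python
# Scores copied from the table above:
# model -> (reported average, [AGIEval, GPT4All, TruthfulQA, Bigbench])
NOUS_SCORES = {
    "mlabonne/Daredevil-8B": (55.87, [44.13, 73.52, 59.05, 46.77]),
    "mlabonne/Daredevil-8B-abliterated": (55.06, [43.29, 73.33, 57.47, 46.17]),
    "mlabonne/Llama-3-8B-Instruct-abliterated-dpomix": (52.26, [41.6, 69.95, 54.22, 43.26]),
    "meta-llama/Meta-Llama-3-8B-Instruct": (51.34, [41.22, 69.86, 51.65, 42.64]),
    "failspy/Meta-Llama-3-8B-Instruct-abliterated-v3": (51.21, [40.23, 69.5, 52.44, 42.69]),
    "mlabonne/OrpoLlama-3-8B": (48.63, [34.17, 70.59, 52.39, 37.36]),
    "meta-llama/Meta-Llama-3-8B": (45.42, [31.1, 69.95, 43.91, 36.7]),
}


def recomputed_average(scores):
    """Unweighted mean of the benchmark scores."""
    return sum(scores) / len(scores)


def best_model(table):
    """Name of the model with the highest recomputed average."""
    return max(table, key=lambda name: recomputed_average(table[name][1]))


if __name__ == "__main__":
    for name, (reported, scores) in NOUS_SCORES.items():
        print(f"{name}: reported {reported:.2f}, recomputed {recomputed_average(scores):.4f}")
    print("Best:", best_model(NOUS_SCORES))
```

Each recomputed mean agrees with the reported value to within rounding in the second decimal place.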

Model Family Tree

(Figure: model family tree of Daredevil-8B)

Configuration

models:
  - model: NousResearch/Meta-Llama-3-8B
    # No parameters necessary for base model
  - model: nbeerbower/llama-3-stella-8B
    parameters:
      density: 0.6
      weight: 0.16
  - model: Hastagaras/llama-3-8b-okay
    parameters:
      density: 0.56
      weight: 0.1
  - model: nbeerbower/llama-3-gutenberg-8B
    parameters:
      density: 0.6
      weight: 0.18
  - model: openchat/openchat-3.6-8b-20240522
    parameters:
      density: 0.56
      weight: 0.12
  - model: Kukedlc/NeuralLLaMa-3-8b-DT-v0.1
    parameters:
      density: 0.58
      weight: 0.18
  - model: cstr/llama3-8b-spaetzle-v20
    parameters:
      density: 0.56
      weight: 0.08
  - model: mlabonne/ChimeraLlama-3-8B-v3
    parameters:
      density: 0.56
      weight: 0.08
  - model: flammenai/Mahou-1.1-llama3-8B
    parameters:
      density: 0.55
      weight: 0.05
  - model: KingNish/KingNish-Llama3-8b
    parameters:
      density: 0.55
      weight: 0.05
merge_method: dare_ties
base_model: NousResearch/Meta-Llama-3-8B
dtype: bfloat16
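
To make the density and weight parameters more concrete, here is a toy sketch of the general DARE-TIES idea on plain Python lists: each fine-tuned model's difference from the base is randomly sparsified (each entry kept with probability density and rescaled by 1/density), scaled by weight, and then combined, keeping only contributions whose sign agrees with the dominant sign for that parameter. This is a simplified illustration under those assumptions, not mergekit's implementation, which operates on tensors and has additional options. The function names are hypothetical; the (density, weight) pairs are copied from the configuration above.

```python
import random

# (density, weight) for the nine merged models, in the order listed in the configuration.
PARAMS = [
    (0.6, 0.16), (0.56, 0.1), (0.6, 0.18), (0.56, 0.12), (0.58, 0.18),
    (0.56, 0.08), (0.56, 0.08), (0.55, 0.05), (0.55, 0.05),
]


def dare(delta, density, rng):
    """Drop-and-rescale: keep each entry with probability `density`, scale survivors by 1/density."""
    return [d / density if rng.random() < density else 0.0 for d in delta]


def sign(x):
    """Return -1, 0 or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def dare_ties_merge(base, finetuned, params, seed=0):
    """Toy DARE-TIES merge of flat parameter lists (illustrative only)."""
    rng = random.Random(seed)
    sparse = []
    for model, (density, weight) in zip(finetuned, params):
        # Task vector: what fine-tuning changed relative to the base model.
        delta = [m - b for m, b in zip(model, base)]
        sparse.append([weight * d for d in dare(delta, density, rng)])
    merged = []
    for i, b in enumerate(base):
        column = [vec[i] for vec in sparse]
        # Sign election: the sign of the summed weighted deltas wins.
        elected = sign(sum(column))
        # Only contributions that agree with the elected sign are added to the base.
        merged.append(b + sum(v for v in column if sign(v) == elected))
    return merged


if __name__ == "__main__":
    print("models:", len(PARAMS), "total weight:", round(sum(w for _, w in PARAMS), 2))
    base = [0.0, 0.0, 0.0]
    toy_models = [[1.0, -1.0, 0.5], [0.8, 0.4, -0.2]]  # made-up values
    print(dare_ties_merge(base, toy_models, PARAMS[:2]))
```

In this configuration the nine weights sum to 1.0, and the densities between 0.55 and 0.6 mean that roughly 55 to 60 percent of each model's changes are retained before merging.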

📄 License

The license for this model is "other".
