🚀 Beyonder-4x7B-v2
Beyonder-4x7B-v2 is a Mixture of Experts (MoE) model created with mergekit (mixtral branch). It combines the strengths of four base models and delivers accurate, efficient text generation across a range of tasks.
🚀 Quick Start
You can run this model in 4-bit precision on Google Colab with the following code:
```python
!pip install -qU transformers bitsandbytes accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "mlabonne/Beyonder-4x7B-v2"
tokenizer = AutoTokenizer.from_pretrained(model)

# Text-generation pipeline loaded in 4-bit (bitsandbytes) with float16 compute
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
)

# Format the chat with the model's chat template, then sample a completion
messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
Output:
A Mixture of Experts (ME) is a machine learning technique that combines multiple expert models to make predictions or decisions. Each expert model is specialized in a different aspect of the problem, and their outputs are combined to produce a more accurate and robust solution. This approach allows the model to leverage the strengths of individual experts and compensate for their weaknesses, improving overall performance.
A notebook is also available for running this model in 4-bit precision on a free T4 GPU in Google Colab.
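If your transformers version rejects `load_in_4bit` inside `model_kwargs`, an explicit `BitsAndBytesConfig` is equivalent. A minimal sketch, assuming recent `transformers` and `bitsandbytes` releases:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Same 4-bit setup as above, expressed as an explicit quantization config.
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
model = AutoModelForCausalLM.from_pretrained(
    "mlabonne/Beyonder-4x7B-v2",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("mlabonne/Beyonder-4x7B-v2")
```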
✨ Key Features
- Mixture of Experts architecture: combines the strengths of several base models for strong text-generation performance.
- Broad applicability: suited to many text-generation scenarios such as question answering, code generation, and story writing.
- Efficient inference: the recommended context length is 8k, balancing output quality and inference efficiency (see the sketch after this list).
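To stay inside that window on long inputs, the prompt can be capped at tokenization time. A minimal sketch; the 256-token generation headroom is an assumption, not a model requirement:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mlabonne/Beyonder-4x7B-v2")

MAX_CTX = 8192   # recommended context length
HEADROOM = 256   # tokens reserved for generation (assumed value)

long_prompt = "some very long document ..."  # placeholder input
ids = tokenizer(long_prompt, truncation=True, max_length=MAX_CTX - HEADROOM)["input_ids"]
print(len(ids))  # never exceeds 8192 - 256 = 7936
```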
📦 Quantized Models
Thanks to TheBloke (and bartowski for EXL2) for the quantized versions:
- GGUF: https://huggingface.co/TheBloke/Beyonder-4x7B-v2-GGUF
- AWQ: https://huggingface.co/TheBloke/Beyonder-4x7B-v2-AWQ
- GPTQ: https://huggingface.co/TheBloke/Beyonder-4x7B-v2-GPTQ
- EXL2: https://huggingface.co/bartowski/Beyonder-4x7B-v2-exl2
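The GGUF files, for example, can run locally through llama.cpp via the llama-cpp-python bindings. A minimal sketch; the exact filename is an assumption, so pick one from the GGUF repo's file list:

```python
from llama_cpp import Llama

# Filename is hypothetical; choose a quantization level from TheBloke/Beyonder-4x7B-v2-GGUF.
llm = Llama(model_path="beyonder-4x7b-v2.Q4_K_M.gguf", n_ctx=8192)

out = llm("Explain what a Mixture of Experts is in less than 100 words.", max_tokens=256)
print(out["choices"][0]["text"])
```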
🏆 Evaluation
Comparison with Mixtral-8x7B-Instruct-v0.1
Beyonder-4x7B-v2 is competitive with Mixtral-8x7B-Instruct-v0.1 on the Open LLM Leaderboard, while using only 4 experts instead of 8.
Comparison with individual experts
It also shows a significant improvement over each of its individual expert models.
Nous benchmark suite
On the Nous benchmark suite it performs strongly against comparable models and is almost on par with the best Yi-34B fine-tune, which is a much larger model: Beyonder-4x7B-v2 has 24.2B parameters with only two experts selected per token at inference (roughly 12B active), versus 34B parameters for the Yi-34B model.
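Those parameter counts can be sanity-checked from Mistral-7B's published dimensions, since the four experts share everything except the per-layer MLPs. A back-of-the-envelope sketch (router gates ignored as negligible):

```python
# Mistral-7B dimensions from its config.json: hidden size, FFN size, layer count.
hidden, ffn, layers = 4096, 14336, 32
base_total = 7.24e9                     # full Mistral-7B parameter count
mlp_total = 3 * hidden * ffn * layers   # gate/up/down projections: ~5.6B per expert copy

moe_total = base_total + 3 * mlp_total  # three extra MLP copies -> ~24.2B total
active = base_total + 1 * mlp_total     # top-2 routing runs 2 of 4 MLP copies -> ~12.9B active
print(f"total ≈ {moe_total / 1e9:.1f}B, active ≈ {active / 1e9:.1f}B")
```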
| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
|---|---:|---:|---:|---:|---:|
| Beyonder-4x7B-v2 | 45.29 | 75.95 | 60.86 | 46.4 | 57.13 |
| NeuralHermes-2.5-Mistral-7B | 43.67 | 73.24 | 55.37 | 41.76 | 53.51 |
| OpenHermes-2.5-Mistral-7B | 42.75 | 72.99 | 52.99 | 40.94 | 52.42 |
| Nous-Hermes-2-SOLAR-10.7B | 47.79 | 74.69 | 55.92 | 44.84 | 55.81 |
| Nous-Hermes-2-Yi-34B | 50.27 | 76.00 | 60.34 | 46.69 | 58.33 |
Detailed per-task results
AGIEval
| Task | Version | Metric | Value | | Stderr |
|---|---:|---|---:|---|---:|
| agieval_aqua_rat | 0 | acc | 23.62 | ± | 2.67 |
| | | acc_norm | 23.62 | ± | 2.67 |
| agieval_logiqa_en | 0 | acc | 41.47 | ± | 1.93 |
| | | acc_norm | 43.01 | ± | 1.94 |
| agieval_lsat_ar | 0 | acc | 23.04 | ± | 2.78 |
| | | acc_norm | 23.48 | ± | 2.80 |
| agieval_lsat_lr | 0 | acc | 51.57 | ± | 2.22 |
| | | acc_norm | 52.94 | ± | 2.21 |
| agieval_lsat_rc | 0 | acc | 64.31 | ± | 2.93 |
| | | acc_norm | 64.68 | ± | 2.92 |
| agieval_sat_en | 0 | acc | 79.13 | ± | 2.84 |
| | | acc_norm | 79.13 | ± | 2.84 |
| agieval_sat_en_without_passage | 0 | acc | 43.20 | ± | 3.46 |
| | | acc_norm | 43.20 | ± | 3.46 |
| agieval_sat_math | 0 | acc | 34.55 | ± | 3.21 |
| | | acc_norm | 32.27 | ± | 3.16 |
GPT4All
| Task | Version | Metric | Value | | Stderr |
|---|---:|---|---:|---|---:|
| arc_challenge | 0 | acc | 61.86 | ± | 1.42 |
| | | acc_norm | 64.51 | ± | 1.40 |
| arc_easy | 0 | acc | 85.06 | ± | 0.73 |
| | | acc_norm | 82.45 | ± | 0.78 |
| boolq | 1 | acc | 88.35 | ± | 0.56 |
| hellaswag | 0 | acc | 68.04 | ± | 0.47 |
| | | acc_norm | 85.12 | ± | 0.36 |
| openbookqa | 0 | acc | 37.80 | ± | 2.17 |
| | | acc_norm | 48.60 | ± | 2.24 |
| piqa | 0 | acc | 83.08 | ± | 0.87 |
| | | acc_norm | 83.95 | ± | 0.86 |
| winogrande | 0 | acc | 78.69 | ± | 1.15 |
TruthfulQA
| Task | Version | Metric | Value | | Stderr |
|---|---:|---|---:|---|---:|
| truthfulqa_mc | 1 | mc1 | 44.55 | ± | 1.74 |
| | | mc2 | 60.86 | ± | 1.57 |
Bigbench
| Task | Version | Metric | Value | | Stderr |
|---|---:|---|---:|---|---:|
| bigbench_causal_judgement | 0 | multiple_choice_grade | 58.95 | ± | 3.58 |
| bigbench_date_understanding | 0 | multiple_choice_grade | 66.40 | ± | 2.46 |
| bigbench_disambiguation_qa | 0 | multiple_choice_grade | 48.84 | ± | 3.12 |
| bigbench_geometric_shapes | 0 | multiple_choice_grade | 22.56 | ± | 2.21 |
| | | exact_str_match | 13.37 | ± | 1.80 |
| bigbench_logical_deduction_five_objects | 0 | multiple_choice_grade | 30.40 | ± | 2.06 |
| bigbench_logical_deduction_seven_objects | 0 | multiple_choice_grade | 20.57 | ± | 1.53 |
| bigbench_logical_deduction_three_objects | 0 | multiple_choice_grade | 52.00 | ± | 2.89 |
| bigbench_movie_recommendation | 0 | multiple_choice_grade | 44.40 | ± | 2.22 |
| bigbench_navigate | 0 | multiple_choice_grade | 52.10 | ± | 1.58 |
| bigbench_reasoning_about_colored_objects | 0 | multiple_choice_grade | 69.75 | ± | 1.03 |
| bigbench_ruin_names | 0 | multiple_choice_grade | 55.36 | ± | 2.35 |
| bigbench_salient_translation_error_detection | 0 | multiple_choice_grade | 23.65 | ± | 1.35 |
| bigbench_snarks | 0 | multiple_choice_grade | 77.35 | ± | 3.12 |
| bigbench_sports_understanding | 0 | multiple_choice_grade | 73.02 | ± | 1.41 |
| bigbench_temporal_sequences | 0 | multiple_choice_grade | 46.80 | ± | 1.58 |
| bigbench_tracking_shuffled_objects_five_objects | 0 | multiple_choice_grade | 22.08 | ± | 1.17 |
| bigbench_tracking_shuffled_objects_seven_objects | 0 | multiple_choice_grade | 19.03 | ± | 0.94 |
| bigbench_tracking_shuffled_objects_three_objects | 0 | multiple_choice_grade | 52.00 | ± | 2.89 |
🧩 Configuration
```yaml
base_model: mlabonne/Marcoro14-7B-slerp
experts:
  - source_model: openchat/openchat-3.5-1210
    positive_prompts:
      - "chat"
      - "assistant"
      - "tell me"
      - "explain"
  - source_model: beowolx/CodeNinja-1.0-OpenChat-7B
    positive_prompts:
      - "code"
      - "python"
      - "javascript"
      - "programming"
      - "algorithm"
  - source_model: maywell/PiVoT-0.1-Starling-LM-RP
    positive_prompts:
      - "storywriting"
      - "write"
      - "scene"
      - "story"
      - "character"
  - source_model: WizardLM/WizardMath-7B-V1.1
    positive_prompts:
      - "reason"
      - "math"
      - "mathematics"
      - "solve"
      - "count"
```
📄 License
This model is released under the microsoft-research-license; see https://huggingface.co/WizardLM/WizardMath-7B-V1.1/resolve/main/LICENSE for details.



