Chicka-Mixtral-3x7b open-source large language model: chat, code, and math tasks

Chicka Mixtral 3x7b

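Developed by Chickaboo

A mixture-of-experts large language model built from three Mistral-architecture models, suited to chat, code, and math tasks

Large language model

Transformers

License: MIT #Mixture of experts #Multi-domain chat #Enhanced math reasoning

Downloads: 77

Released: 4/22/2024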

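Model overview

A mixture-of-experts large language model built from three Mistral-architecture models. It combines a base chat expert, a code expert, and a math expert, and automatically selects the most suitable expert for each task.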

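Model features

Mixture-of-experts architecture

Combines expert models for three domains (chat, code, and math) and automatically selects the most suitable expert for the input.

Keyword-triggered routing

Each expert is associated with a list of keyword prompts (such as "python" or "algebra") in the merge configuration, which steer inputs of the matching task type to that expert.

Benchmark performance

Outperforms Mistral-7B-Instruct-v0.2 and Meta-Llama-3-8B on the Open LLM Leaderboard average (69.19 versus 60.97 and 62.55), trailing Meta-Llama-3-8B only on MMLU.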

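Model capabilities

Natural language conversation

Code generation and explanation

Math problem solving

Multi-turn conversation

Text understanding and generation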

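Use cases

Development assistance

Code generation

Generates code in several programming languages from natural-language descriptions

Supports Python, JavaScript, C++, and other languages

Code debugging

Helps developers understand and fix errors in their code

Can explain runtime errors and suggest fixes

Education

Math tutoring

Answers math problems and shows the solution steps

Scores 70.66 on the GSM8K math benchmark

Concept explanation

Explains complex concepts in plain language

Suitable for learners at different knowledge levels

Intelligent assistant

Everyday Q&A

Answers everyday questions and offers suggestions

Scores 50.51 on the TruthfulQA benchmark

Recipe recommendations

Provides cooking advice and recipes based on the user's needs

Can generate detailed cooking steps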

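🚀 Chicka-Mixtral-3x7b model

Chicka-Mixtral-3x7b is a large language model merged with the Mixture of Experts technique. It combines three Mistral-architecture models and scores well across a range of natural language processing benchmarks.

🚀 Quick start

The following Python example loads Chicka-Mixtral-3x7b and generates a chat reply: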

from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to load the model onto

# Load the model and its tokenizer from the same repository
model = AutoModelForCausalLM.from_pretrained("Chickaboo/Chicka-Mixtral-3x7b")
tokenizer = AutoTokenizer.from_pretrained("Chickaboo/Chicka-Mixtral-3x7b")

# Conversation history; the last user message is the one to be answered
messages = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
    {"role": "user", "content": "Do you have mayonnaise recipes?"}
]

# Format the conversation with the tokenizer's chat template and tokenize it
encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")

# Move the inputs and the model to the target device
model_inputs = encodeds.to(device)
model.to(device)

# Sample up to 1000 new tokens, then decode the full sequence (prompt included)
generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
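
The example above prints the full sequence, prompt included. As a minimal sketch of keeping only the newly generated part, the hypothetical helper below (strip_prompt is an illustrative name, not part of transformers) drops the prompt tokens from each sequence. It is shown on plain lists of made-up token IDs so that it runs without downloading the model.

```python
def strip_prompt(generated_ids, prompt_length):
    """Drop the first `prompt_length` token IDs from every generated sequence.

    Hypothetical helper for illustration: `generated_ids` is a batch of
    token-ID sequences that each begin with the same prompt.
    """
    return [list(sequence[prompt_length:]) for sequence in generated_ids]


# Made-up token IDs: the first three stand in for the prompt.
prompt_ids = [101, 7, 42]
batch = [prompt_ids + [900, 901, 2]]

new_tokens = strip_prompt(batch, len(prompt_ids))
print(new_tokens)
```

With the tensors from the example, the same effect should be obtainable by slicing generated_ids[:, model_inputs.shape[1]:] before calling tokenizer.batch_decode.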

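✨ Key features

Mixture-of-experts architecture: merges three Mistral-based models specialized in chat, code, and math respectively, strengthening the model across several domains.
Benchmark performance: scores 64.08 on ARC, 83.96 on Hellaswag, and 50.51 on TruthfulQA, benchmarks that cover knowledge, reasoning, and truthful generation.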

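📦 Installation guide

This README does not give specific installation steps. See the official documentation of the transformers library for how to install it and use the model.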

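📚 Detailed documentation

Model description

This model is a large language model merged with the Mixture of Experts technique. It consists of three Mistral-based models:

Base model / chat expert: openchat/openchat-3.5-0106
Code expert: beowolx/CodeNinja-1.0-OpenChat-7B
Math expert: meta-math/MetaMath-Mistral-7B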
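
To illustrate what mixture-of-experts routing means in general, here is a toy sketch in plain Python. It assumes Mixtral-style top-2 routing, where a router scores every expert for each token and the outputs of the two best-scoring experts are blended. The source does not state how many experts this model activates per token, and the names route_token, router_logits, and expert_outputs are illustrative only.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits, expert_outputs, top_k=2):
    """Blend the outputs of the top_k experts for a single token.

    router_logits: one score per expert (assumed to come from a router layer).
    expert_outputs: one output vector per expert, all of the same length.
    """
    # Indices of the top_k highest-scoring experts (ties keep original order).
    chosen = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)[:top_k]
    # Normalize the chosen experts' scores so their weights sum to 1.
    weights = softmax([router_logits[i] for i in chosen])
    mixed = [0.0] * len(expert_outputs[0])
    for weight, index in zip(weights, chosen):
        for d, value in enumerate(expert_outputs[index]):
            mixed[d] += weight * value
    return chosen, weights, mixed


# Toy values: three experts (chat, code, math) with two-dimensional outputs.
chosen, weights, mixed = route_token(
    router_logits=[1.0, 0.0, 1.0],
    expert_outputs=[[1.0, 0.0], [0.0, 1.0], [3.0, 2.0]],
)
print(chosen, weights, mixed)
```

In this toy case the first and third experts tie for the highest score, so each contributes half of the blended output.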

The following Mergekit configuration was used for the merge:

base_model: openchat/openchat-3.5-0106
experts:
  - source_model: openchat/openchat-3.5-0106
    positive_prompts:
    - "chat"
    - "assistant"
    - "tell me"
    - "explain"
    - "I want"
  - source_model: beowolx/CodeNinja-1.0-OpenChat-7B
    positive_prompts:
    - "code"
    - "python"
    - "javascript"
    - "programming"
    - "algorithm"
    - "C#"
    - "C++"
    - "debug"
    - "runtime"
    - "html"
    - "command"
    - "nodejs"
  - source_model: meta-math/MetaMath-Mistral-7B
    positive_prompts:
    - "reason"
    - "math"
    - "mathematics"
    - "solve"
    - "count"
    - "calculate"
    - "arithmetic"
    - "algebra"

Open LLM Leaderboard

Benchmark	Chicka-Mixtral-3X7B	Mistral-7B-Instruct-v0.2	Meta-Llama-3-8B
Average	69.19	60.97	62.55
ARC	64.08	59.98	59.47
Hellaswag	83.96	83.31	82.09
MMLU	64.87	64.16	66.67
TruthfulQA	50.51	42.15	43.95
Winogrande	81.06	78.37	77.35
GSM8K	70.66	37.83	45.79
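
As a quick check on the table, the Average row can be recomputed as the plain mean of the six benchmark scores. The sketch below copies the scores from the table and assumes the average is an unweighted mean rounded to two decimals.

```python
# Scores in table order: ARC, Hellaswag, MMLU, TruthfulQA, Winogrande, GSM8K.
SCORES = {
    "Chicka-Mixtral-3X7B": [64.08, 83.96, 64.87, 50.51, 81.06, 70.66],
    "Mistral-7B-Instruct-v0.2": [59.98, 83.31, 64.16, 42.15, 78.37, 37.83],
    "Meta-Llama-3-8B": [59.47, 82.09, 66.67, 43.95, 77.35, 45.79],
}


def average(values):
    """Unweighted mean rounded to two decimals."""
    return round(sum(values) / len(values), 2)


averages = {model: average(values) for model, values in SCORES.items()}
for model, value in averages.items():
    print(f"{model}: {value}")
```

The recomputed means match the Average row of the table: 69.19, 60.97, and 62.55.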