Hyperion-3.0-Mistral-7B-DPO Open-Source Model - Free Question Answering, Code Generation, and Multi-Domain Reasoning

Hyperion 3.0 Mistral 7B DPO

Developed by Locutusque

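A DPO-optimized model based on Mistral-7B, suited to question answering, code generation, and multi-domain reasoning tasks

Large Language Model

Transformers

English

Open Source License: Apache-2.0

#DPO-optimized reasoning #Expert-level multi-domain Q&A #Precise STEM modeling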

Downloads 15

Release Time : 3/24/2024

Model Overview

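A language model fine-tuned with Direct Preference Optimization (DPO), focused on complex reasoning, programming assistance, and problem solving in specialized domains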

Model Features

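DPO Optimization

Trained with Direct Preference Optimization on 20,000 high-quality preference pairs generated by GPT-4

Multi-Domain Capability

Covers STEM, the social sciences, and the humanities (MMLU group scores in the evaluation tables range from 0.4866 for STEM to 0.7003 for social sciences)

Specialized Reasoning

Tuned in particular for mathematical derivation and logical reasoning, so it can work through complex scientific problems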

Model Capabilities

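Text generation

Technical question answering

Code generation

Medical text analysis

Mathematical problem solving

Logical reasoning

Multi-turn dialogue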

Use Cases

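Education

Physics teaching aid

Analyzes mechanics problems and sets up the governing differential equations

For example, it can derive the complete equations of motion for a projectile

Software Development

Code generation

Generates executable code from natural-language descriptions

Healthcare

Medical text analysis

Parses specialized medical literature and extracts key information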

🚀 Hyperion-3.0-Mistral-7B-DPO

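Hyperion-3.0-Mistral-7B-DPO is a fine-tuned language model for complex tasks such as question answering, conversation, and code generation. It is trained on carefully selected data with preference optimization, with the aim of producing high-quality output across different domains.

🚀 Quick Start

The following basic example uses Hyperion-3.0-Mistral-7B-DPO for text generation: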

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Locutusque/Hyperion-3.0-Mistral-7B-DPO"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# For a text generation task
input_text = "<|im_start|>user\nExplain the implications of quantum entanglement in layman's terms.<|im_end|>\n<|im_start|>assistant\n"
input_ids = tokenizer.encode(input_text, return_tensors="pt")

# Generate a response
outputs = model.generate(input_ids, max_length=200, do_sample=True, top_p=0.7, top_k=6) # These are the recommended sample settings.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
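The prompt string in the example above follows the ChatML layout: each turn is wrapped in `<|im_start|>` and `<|im_end|>` markers, and the prompt ends with an open assistant header. The sketch below builds such a prompt from a list of turns, which also covers multi-turn dialogue. The helper name `build_chatml_prompt` is hypothetical and not part of transformers or the model card. Whether this model's tokenizer ships its own chat template is not stated in the source, so the format is assembled by hand here.

```python
from typing import Dict, List

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_chatml_prompt(messages: List[Dict[str, str]]) -> str:
    """Format a list of {"role", "content"} turns as a ChatML prompt.

    Hypothetical helper for illustration only. The trailing assistant
    header cues the model to write the next reply.
    """
    parts = [
        f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n"
        for message in messages
    ]
    parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


# Reproduces the single-turn prompt used in the quick-start example.
prompt = build_chatml_prompt(
    [
        {
            "role": "user",
            "content": "Explain the implications of quantum entanglement in layman's terms.",
        }
    ]
)
print(prompt)
```

The resulting string can be passed to `tokenizer.encode(prompt, return_tensors="pt")` exactly as `input_text` is in the example above.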

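✨ Key Features

Multi-domain use: supports complex tasks including question answering, conversational AI, code generation, medical text comprehension, mathematical reasoning, and logical reasoning.
Curated training data: fine-tuned on a carefully curated dataset of 20,000 preference pairs generated by GPT-4, selected for output quality and relevance.
Aligned with human preferences: trained with Direct Preference Optimization (DPO) so that outputs align more closely with human preferences.

📦 Installation

The source documentation gives no specific installation steps. To run the quick-start example above, make sure the transformers library is installed, along with PyTorch, which the example's return_tensors="pt" argument requires.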
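As a small convenience, the check below reports which of the packages the quick-start example relies on cannot be found in the current environment. It is a minimal sketch: the helper name `missing_packages` is hypothetical, and the list of required packages (`transformers`, `torch`) is an assumption inferred from the example code rather than something the source states.

```python
import importlib.util
from typing import List


def missing_packages(names: List[str]) -> List[str]:
    """Return the top-level package names that cannot be located.

    Hypothetical helper: find_spec returns None when a top-level
    module is not installed, without importing it.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed requirements for the quick-start example.
required = ["transformers", "torch"]
missing = missing_packages(required)
if missing:
    print("Install first: pip install " + " ".join(missing))
else:
    print("All required packages are available.")
```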


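📚 Documentation

Model Details

Property	Details
Model Name	Locutusque/Hyperion-3.0-Mistral-7B-DPO
Base Model	mistralai/Mistral-7B-v0.1
Publisher	Locutusque
Model Type	Question answering, conversational AI, code generation, medical text comprehension, mathematical reasoning, logical reasoning
Language	Multi-domain, English
License	Apache-2.0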

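Intended Use

The model is intended for researchers, developers, and organizations working on challenging problems across domains. Potential use cases include:

Intelligent tutoring systems and educational applications in science, medicine, mathematics, and computer science.
Advanced conversational AI for technical support, customer service, and domain-specific chatbots.
Code generation and analysis tools for software development and programming assistance.
Medical text analysis and information retrieval for healthcare professionals and researchers.
Mathematical problem solving and logical reasoning applications for academia and industry.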

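Training Data

The Locutusque/Hyperion-3.0-Mistral-7B-DPO model was fine-tuned on a carefully curated dataset of 20,000 preference pairs, of which 4,000 examples were used for fine-tuning. The examples were generated by GPT-4 and cover programming, medical texts, mathematical problems, and reasoning tasks, among other domains. Training on this data with Direct Preference Optimization (DPO) is intended to align the model's outputs more closely with human preferences and improve overall performance.

Quantized Versions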
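To make the DPO step concrete, the sketch below computes the standard DPO loss for a single preference pair from four sequence log-probabilities: the chosen and rejected responses under the model being trained (the policy) and under a frozen reference model. The source does not say which training framework or which `beta` was used, so the function name, the sample numbers, and `beta=0.1` are all illustrative assumptions.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # illustrative value; the source does not give one
) -> float:
    """DPO loss for one (chosen, rejected) preference pair.

    loss = -log(sigmoid(beta * ((pi_c - ref_c) - (pi_r - ref_r))))
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch to avoid overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Policy identical to the reference: the loss is log(2).
print(dpo_loss(-10.0, -10.0, -10.0, -10.0))
# Policy favors the chosen response more than the reference does: lower loss.
print(dpo_loss(-5.0, -15.0, -10.0, -10.0))
```

The loss falls as the policy raises the likelihood of the chosen response relative to the rejected one, measured against the reference model, which is how preference pairs steer the model's outputs.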

ExLlamaV2: https://huggingface.co/bartowski/Hyperion-3.0-Mistral-7B-DPO-exl2
GGUF: https://huggingface.co/bartowski/Hyperion-3.0-Mistral-7B-DPO-GGUF

Evaluation Results

Task Evaluation

Task	Version	Filter	Metric	Value		Std. Error
mmlu_flan_cot_fewshot	N/A	get-answer	exact_match	0.5833	±	0.0118
- mmlu_flan_cot_fewshot_humanities	N/A	get-answer	exact_match	0.5039	±	0.0205
- mmlu_flan_cot_fewshot_formal_logic	0	get-answer	exact_match	0.2143	±	0.1138
- mmlu_flan_cot_fewshot_high_school_european_history	0	get-answer	exact_match	0.6667	±	0.1143
- mmlu_flan_cot_fewshot_high_school_us_history	0	get-answer	exact_match	0.7727	±	0.0914
- mmlu_flan_cot_fewshot_high_school_world_history	0	get-answer	exact_match	0.5385	±	0.0997
- mmlu_flan_cot_fewshot_international_law	0	get-answer	exact_match	0.9231	±	0.0769
- mmlu_flan_cot_fewshot_jurisprudence	0	get-answer	exact_match	0.5455	±	0.1575
- mmlu_flan_cot_fewshot_logical_fallacies	0	get-answer	exact_match	0.7778	±	0.1008
- mmlu_flan_cot_fewshot_moral_disputes	0	get-answer	exact_match	0.5526	±	0.0817
- mmlu_flan_cot_fewshot_moral_scenarios	0	get-answer	exact_match	0.4000	±	0.0492
- mmlu_flan_cot_fewshot_philosophy	0	get-answer	exact_match	0.7647	±	0.0738
- mmlu_flan_cot_fewshot_prehistory	0	get-answer	exact_match	0.6571	±	0.0814
- mmlu_flan_cot_fewshot_professional_law	0	get-answer	exact_match	0.3294	±	0.0362
- mmlu_flan_cot_fewshot_world_religions	0	get-answer	exact_match	0.8947	±	0.0723
- mmlu_flan_cot_fewshot_other	N/A	get-answer	exact_match	0.6833	±	0.0244
- mmlu_flan_cot_fewshot_business_ethics	0	get-answer	exact_match	0.9091	±	0.0909
- mmlu_flan_cot_fewshot_clinical_knowledge	0	get-answer	exact_match	0.5862	±	0.0931
- mmlu_flan_cot_fewshot_college_medicine	0	get-answer	exact_match	0.6364	±	0.1050
- mmlu_flan_cot_fewshot_global_facts	0	get-answer	exact_match	0.6000	±	0.1633
- mmlu_flan_cot_fewshot_human_aging	0	get-answer	exact_match	0.6087	±	0.1041
- mmlu_flan_cot_fewshot_management	0	get-answer	exact_match	0.9091	±	0.0909
- mmlu_flan_cot_fewshot_marketing	0	get-answer	exact_match	0.8000	±	0.0816
- mmlu_flan_cot_fewshot_medical_genetics	0	get-answer	exact_match	1.0000	±	0.0000
- mmlu_flan_cot_fewshot_miscellaneous	0	get-answer	exact_match	0.8023	±	0.0432
- mmlu_flan_cot_fewshot_nutrition	0	get-answer	exact_match	0.6667	±	0.0833
- mmlu_flan_cot_fewshot_professional_accounting	0	get-answer	exact_match	0.4839	±	0.0912
- mmlu_flan_cot_fewshot_professional_medicine	0	get-answer	exact_match	0.5806	±	0.0901
- mmlu_flan_cot_fewshot_virology	0	get-answer	exact_match	0.3889	±	0.1182
- mmlu_flan_cot_fewshot_social_sciences	N/A	get-answer	exact_match	0.7003	±	0.0239
- mmlu_flan_cot_fewshot_econometrics	0	get-answer	exact_match	0.4167	±	0.1486
- mmlu_flan_cot_fewshot_high_school_geography	0	get-answer	exact_match	0.9091	±	0.0627
- mmlu_flan_cot_fewshot_high_school_government_and_politics	0	get-answer	exact_match	0.8095	±	0.0878
- mmlu_flan_cot_fewshot_high_school_macroeconomics	0	get-answer	exact_match	0.6512	±	0.0735
- mmlu_flan_cot_fewshot_high_school_microeconomics	0	get-answer	exact_match	0.5769	±	0.0988
- mmlu_flan_cot_fewshot_high_school_psychology	0	get-answer	exact_match	0.9000	±	0.0391
- mmlu_flan_cot_fewshot_human_sexuality	0	get-answer	exact_match	0.6667	±	0.1421
- mmlu_flan_cot_fewshot_professional_psychology	0	get-answer	exact_match	0.6522	±	0.0578
- mmlu_flan_cot_fewshot_public_relations	0	get-answer	exact_match	0.5833	±	0.1486
- mmlu_flan_cot_fewshot_security_studies	0	get-answer	exact_match	0.4074	±	0.0964
- mmlu_flan_cot_fewshot_sociology	0	get-answer	exact_match	0.8182	±	0.0842
- mmlu_flan_cot_fewshot_us_foreign_policy	0	get-answer	exact_match	0.7273	±	0.1408
- mmlu_flan_cot_fewshot_stem	N/A	get-answer	exact_match	0.4866	±	0.0262
- mmlu_flan_cot_fewshot_abstract_algebra	0	get-answer	exact_match	0.0909	±	0.0909
- mmlu_flan_cot_fewshot_anatomy	0	get-answer	exact_match	0.4286	±	0.1373
- mmlu_flan_cot_fewshot_astronomy	0	get-answer	exact_match	0.5625	±	0.1281
- mmlu_flan_cot_fewshot_college_biology	0	get-answer	exact_match	0.5000	±	0.1291
- mmlu_flan_cot_fewshot_college_chemistry	0	get-answer	exact_match	0.5000	±	0.1890
- mmlu_flan_cot_fewshot_college_computer_science	0	get-answer	exact_match	0.2727	±	0.1408
- mmlu_flan_cot_fewshot_college_mathematics	0	get-answer	exact_match	0.3636	±	0.1521
- mmlu_flan_cot_fewshot_college_physics	0	get-answer	exact_match	0.3636	±	0.1521
- mmlu_flan_cot_fewshot_computer_security	0	get-answer	exact_match	0.7273	±	0.1408
- mmlu_flan_cot_fewshot_conceptual_physics	0	get-answer	exact_match	0.6538	±	0.0951
- mmlu_flan_cot_fewshot_electrical_engineering	0	get-answer	exact_match	0.7500	±	0.1118
- mmlu_flan_cot_fewshot_elementary_mathematics	0	get-answer	exact_match	0.7317	±	0.0701
- mmlu_flan_cot_fewshot_high_school_biology	0	get-answer	exact_match	0.5938	±	0.0882
- mmlu_flan_cot_fewshot_high_school_chemistry	0	get-answer	exact_match	0.3636	±	0.1050
- mmlu_flan_cot_fewshot_high_school_computer_science	0	get-answer	exact_match	0.5556	±	0.1757
- mmlu_flan_cot_fewshot_high_school_mathematics	0	get-answer	exact_match	0.3103	±	0.0874
- mmlu_flan_cot_fewshot_high_school_physics	0	get-answer	exact_match	0.2353	±	0.1060
- mmlu_flan_cot_fewshot_high_school_statistics	0	get-answer	exact_match	0.3043	±	0.0981
- mmlu_flan_cot_fewshot_machine_learning	0	get-answer	exact_match	0.4545	±	0.1575
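Many per-task standard errors above are large because the subsets are small. The sketch below shows one way to relate an exact-match score to its standard error, using the sample standard deviation of 0/1 outcomes divided by the square root of the item count. The table does not list item counts, so `num_items=11` for abstract_algebra is an assumption inferred from the score 0.0909 (about 1/11), and the helper name is hypothetical.

```python
import math


def accuracy_stderr(num_correct: int, num_items: int) -> float:
    """Standard error of the mean of 0/1 exact-match outcomes.

    Uses the sample (n - 1) variance, so it simplifies to
    sqrt(p * (1 - p) / (n - 1)) with p = num_correct / num_items.
    """
    p = num_correct / num_items
    return math.sqrt(p * (1.0 - p) / (num_items - 1))


# Assumed counts: 1 correct answer out of 11 items (score 1/11 = 0.0909).
score = 1 / 11
stderr = accuracy_stderr(num_correct=1, num_items=11)
print(f"{score:.4f} ± {stderr:.4f}")
```

Under that assumed count the formula reproduces the 0.0909 ± 0.0909 row, and a task answered entirely correctly yields a standard error of zero, as in the medical_genetics row.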

Group Evaluation

Group	Version	Filter	Metric	Value		Std. Error
mmlu_flan_cot_fewshot	N/A	get-answer	exact_match	0.5833	±	0.0118
- mmlu_flan_cot_fewshot_humanities	N/A	get-answer	exact_match	0.5039	±	0.0205
- mmlu_flan_cot_fewshot_other	N/A	get-answer	exact_match	0.6833	±	0.0244
- mmlu_flan_cot_fewshot_social_sciences	N/A	get-answer	exact_match	0.7003	±	0.0239
- mmlu_flan_cot_fewshot_stem	N/A	get-answer	exact_match	0.4866	±	0.0262
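When reading the group scores, it can help to check whether a gap between two groups is larger than the reported uncertainty. The sketch below computes a simple z-score for the difference between two groups from the values and standard errors in the table. It assumes the groups are independent samples and that a normal approximation is adequate, which the source does not state. The helper name `z_score` is hypothetical.

```python
import math
from typing import Dict, Tuple

# (value, standard error) copied from the group evaluation table.
groups: Dict[str, Tuple[float, float]] = {
    "humanities": (0.5039, 0.0205),
    "other": (0.6833, 0.0244),
    "social_sciences": (0.7003, 0.0239),
    "stem": (0.4866, 0.0262),
}


def z_score(name_a: str, name_b: str) -> float:
    """Difference between two group scores in units of its combined stderr."""
    value_a, stderr_a = groups[name_a]
    value_b, stderr_b = groups[name_b]
    return (value_a - value_b) / math.sqrt(stderr_a**2 + stderr_b**2)


for pair in [("social_sciences", "stem"), ("other", "social_sciences")]:
    z = z_score(*pair)
    verdict = "clear gap" if abs(z) > 1.96 else "within noise"
    print(f"{pair[0]} vs {pair[1]}: z = {z:.2f} ({verdict})")
```

The 1.96 threshold corresponds to a two-sided 95% level under the normal approximation.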