Hyperion-3.0-Mistral-7B-DPO open-source model: free question answering, code generation, and multi-domain reasoning

Hyperion 3.0 Mistral 7B DPO

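Developed by Locutusque

A DPO-tuned model based on Mistral-7B, aimed at question answering, code generation, and multi-domain reasoning tasks

Large language model

Transformers

English   License: Apache-2.0   #DPO-optimized reasoning #Expert-level multi-domain Q&A #Precise STEM modeling

Downloads: 15

Released: 3/24/2024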

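Model overview

A language model fine-tuned with Direct Preference Optimization (DPO), focused on complex reasoning, programming assistance, and problem solving in specialized domains

Model features

DPO optimization

Direct Preference Optimization on 20,000 high-quality preference pairs generated by GPT-4

Multi-domain capability

Covers STEM, the social sciences, and the humanities; in the MMLU evaluation reported on this page it scores highest on the social sciences group and lowest on the STEM group

Specialized reasoning

Particular emphasis on mathematical derivation and logical reasoning, for handling complex scientific problems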

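Model capabilities

Text generation

Technical question answering

Code generation

Medical text analysis

Mathematical problem solving

Logical reasoning

Multi-turn dialogue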

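Use cases

Education

Physics teaching support

Analyzes mechanics problems and sets up the corresponding differential equations

For example, deriving the equations of projectile motion in full

Software development

Code generation

Generates executable code from natural-language descriptions

Healthcare

Medical text analysis

Parses specialist medical literature and extracts key information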

🚀 Hyperion-3.0-Mistral-7B-DPO

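Hyperion-3.0-Mistral-7B-DPO is a fine-tuned language model aimed at complex tasks such as question answering, dialogue, and code generation. It was trained on curated data with preference optimization so that its outputs suit the needs of a range of domains.

🚀 Quick start

The following is a basic code example for text generation with Hyperion-3.0-Mistral-7B-DPO: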

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Locutusque/Hyperion-3.0-Mistral-7B-DPO"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# For a text generation task
input_text = "<|im_start|>user\nExplain the implications of quantum entanglement in layman's terms.<|im_end|>\n<|im_start|>assistant\n"
input_ids = tokenizer.encode(input_text, return_tensors="pt")

# Generate a response
outputs = model.generate(input_ids, max_length=200, do_sample=True, top_p=0.7, top_k=6) # These are the recommended sample settings.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
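
The prompt in the example above wraps each turn in <|im_start|> and <|im_end|> markers and ends with an open assistant turn. For multi-turn conversations, a small helper can assemble that string. This is a minimal sketch: build_chatml_prompt is a hypothetical helper name, not part of transformers, and it assumes the model expects exactly the delimiter layout shown in the example.

```python
from typing import Dict, List


def build_chatml_prompt(messages: List[Dict[str, str]]) -> str:
    """Assemble a prompt in the delimiter layout used by the example above.

    Hypothetical helper: each message is a dict with "role" and "content" keys.
    """
    parts = []
    for message in messages:
        # Each turn is "<|im_start|>{role}\n{content}<|im_end|>\n".
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    # Leave the assistant turn open so the model generates the reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = build_chatml_prompt([
    {
        "role": "user",
        "content": "Explain the implications of quantum entanglement in layman's terms.",
    },
])
print(prompt)
```

The resulting string can be passed to tokenizer.encode in place of input_text. Note that max_length in generate counts the prompt tokens as well as the generated ones, so longer conversations may need a larger value, or max_new_tokens instead.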

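✨ Key features

Multi-domain use: supports complex tasks including question answering, conversational AI, code generation, medical text comprehension, mathematical reasoning, and logical reasoning.
Curated training data: fine-tuned on a curated dataset of 20,000 preference pairs generated by GPT-4, selected for output quality and relevance.
Aligned with human preferences: trained with Direct Preference Optimization (DPO) so that the model's outputs align more closely with human preferences.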

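📦 Installation

The model card gives no specific installation steps. To use the model, make sure the transformers library is installed, then follow the code example in the Quick start section above.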
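
As a rough pre-flight check (a sketch, not an official installation procedure), the snippet below reports whether the packages the example relies on can be found. It assumes a PyTorch backend, because the example requests return_tensors="pt"; the source does not specify package versions, and missing_packages is a hypothetical helper name.

```python
import importlib.util

# Assumption: PyTorch backend, because the example uses return_tensors="pt".
REQUIRED_PACKAGES = ("transformers", "torch")


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    # Typical fix: pip install transformers torch
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```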

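📚 Documentation

Model details

Property	Details
Model name	Locutusque/Hyperion-3.0-Mistral-7B-DPO
Base model	mistralai/Mistral-7B-v0.1
Publisher	Locutusque
Model type	Question answering, conversational AI, code generation, medical text comprehension, mathematical reasoning, logical reasoning
Language	Multi-domain, English
License	Apache-2.0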

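Intended use

The model is intended for researchers, developers, and organizations tackling challenging problems across domains. Potential use cases include:

Intelligent tutoring systems and educational applications in science, medicine, mathematics, and computer science.
Advanced conversational AI for technical support, customer service, and domain-specific chatbots.
Code generation and analysis tools for software development and programming assistance.
Medical text analysis and information retrieval for healthcare professionals and researchers.
Mathematical problem-solving and logical reasoning applications for academia and industry.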

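Training data

Locutusque/Hyperion-3.0-Mistral-7B-DPO was fine-tuned on a curated dataset of 20,000 preference pairs, of which 4,000 examples were used for fine-tuning. The examples were generated by GPT-4 and cover programming, medical texts, mathematical problems, and reasoning tasks. Training applies Direct Preference Optimization (DPO) to these pairs so that the model's outputs align more closely with human preferences and overall performance improves.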
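
To make the DPO step concrete, here is a minimal sketch of the per-pair DPO objective as it is usually written: the negative log-sigmoid of β times the difference between the policy-versus-reference log-ratio of the preferred answer and that of the rejected answer. The function name, the log-probability inputs, and β = 0.1 are illustrative assumptions; the source does not report the hyperparameters or training code used for this model.

```python
import math


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-pair DPO loss: -log(sigmoid(beta * (chosen log-ratio - rejected log-ratio))).

    beta=0.1 is an illustrative default, not a value reported for this model.
    """
    chosen_log_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_log_ratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_log_ratio - rejected_log_ratio)
    # -log(sigmoid(m)) == log(1 + exp(-m)), computed in a numerically stable form.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Made-up sequence log-probabilities for one preference pair.
untrained = dpo_loss(-12.0, -15.0, -12.0, -15.0)  # policy identical to reference
improved = dpo_loss(-10.0, -18.0, -12.0, -15.0)   # policy favors the chosen answer more
print(untrained, improved)
```

When the policy matches the reference model the margin is zero and the loss equals log 2; the loss falls as the policy raises the likelihood of the preferred answer relative to the rejected one.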

Quantized versions

ExLlamaV2: https://huggingface.co/bartowski/Hyperion-3.0-Mistral-7B-DPO-exl2
GGUF: https://huggingface.co/bartowski/Hyperion-3.0-Mistral-7B-DPO-GGUF

Evaluation results

Task evaluation

Task	Version	Filter	Metric	Value		Stderr
mmlu_flan_cot_fewshot	N/A	get-answer	exact_match	0.5833	±	0.0118
- mmlu_flan_cot_fewshot_humanities	N/A	get-answer	exact_match	0.5039	±	0.0205
  - mmlu_flan_cot_fewshot_formal_logic	0	get-answer	exact_match	0.2143	±	0.1138
  - mmlu_flan_cot_fewshot_high_school_european_history	0	get-answer	exact_match	0.6667	±	0.1143
  - mmlu_flan_cot_fewshot_high_school_us_history	0	get-answer	exact_match	0.7727	±	0.0914
  - mmlu_flan_cot_fewshot_high_school_world_history	0	get-answer	exact_match	0.5385	±	0.0997
  - mmlu_flan_cot_fewshot_international_law	0	get-answer	exact_match	0.9231	±	0.0769
  - mmlu_flan_cot_fewshot_jurisprudence	0	get-answer	exact_match	0.5455	±	0.1575
  - mmlu_flan_cot_fewshot_logical_fallacies	0	get-answer	exact_match	0.7778	±	0.1008
  - mmlu_flan_cot_fewshot_moral_disputes	0	get-answer	exact_match	0.5526	±	0.0817
  - mmlu_flan_cot_fewshot_moral_scenarios	0	get-answer	exact_match	0.4000	±	0.0492
  - mmlu_flan_cot_fewshot_philosophy	0	get-answer	exact_match	0.7647	±	0.0738
  - mmlu_flan_cot_fewshot_prehistory	0	get-answer	exact_match	0.6571	±	0.0814
  - mmlu_flan_cot_fewshot_professional_law	0	get-answer	exact_match	0.3294	±	0.0362
  - mmlu_flan_cot_fewshot_world_religions	0	get-answer	exact_match	0.8947	±	0.0723
- mmlu_flan_cot_fewshot_other	N/A	get-answer	exact_match	0.6833	±	0.0244
  - mmlu_flan_cot_fewshot_business_ethics	0	get-answer	exact_match	0.9091	±	0.0909
  - mmlu_flan_cot_fewshot_clinical_knowledge	0	get-answer	exact_match	0.5862	±	0.0931
  - mmlu_flan_cot_fewshot_college_medicine	0	get-answer	exact_match	0.6364	±	0.1050
  - mmlu_flan_cot_fewshot_global_facts	0	get-answer	exact_match	0.6000	±	0.1633
  - mmlu_flan_cot_fewshot_human_aging	0	get-answer	exact_match	0.6087	±	0.1041
  - mmlu_flan_cot_fewshot_management	0	get-answer	exact_match	0.9091	±	0.0909
  - mmlu_flan_cot_fewshot_marketing	0	get-answer	exact_match	0.8000	±	0.0816
  - mmlu_flan_cot_fewshot_medical_genetics	0	get-answer	exact_match	1.0000	±	0.0000
  - mmlu_flan_cot_fewshot_miscellaneous	0	get-answer	exact_match	0.8023	±	0.0432
  - mmlu_flan_cot_fewshot_nutrition	0	get-answer	exact_match	0.6667	±	0.0833
  - mmlu_flan_cot_fewshot_professional_accounting	0	get-answer	exact_match	0.4839	±	0.0912
  - mmlu_flan_cot_fewshot_professional_medicine	0	get-answer	exact_match	0.5806	±	0.0901
  - mmlu_flan_cot_fewshot_virology	0	get-answer	exact_match	0.3889	±	0.1182
- mmlu_flan_cot_fewshot_social_sciences	N/A	get-answer	exact_match	0.7003	±	0.0239
  - mmlu_flan_cot_fewshot_econometrics	0	get-answer	exact_match	0.4167	±	0.1486
  - mmlu_flan_cot_fewshot_high_school_geography	0	get-answer	exact_match	0.9091	±	0.0627
  - mmlu_flan_cot_fewshot_high_school_government_and_politics	0	get-answer	exact_match	0.8095	±	0.0878
  - mmlu_flan_cot_fewshot_high_school_macroeconomics	0	get-answer	exact_match	0.6512	±	0.0735
  - mmlu_flan_cot_fewshot_high_school_microeconomics	0	get-answer	exact_match	0.5769	±	0.0988
  - mmlu_flan_cot_fewshot_high_school_psychology	0	get-answer	exact_match	0.9000	±	0.0391
  - mmlu_flan_cot_fewshot_human_sexuality	0	get-answer	exact_match	0.6667	±	0.1421
  - mmlu_flan_cot_fewshot_professional_psychology	0	get-answer	exact_match	0.6522	±	0.0578
  - mmlu_flan_cot_fewshot_public_relations	0	get-answer	exact_match	0.5833	±	0.1486
  - mmlu_flan_cot_fewshot_security_studies	0	get-answer	exact_match	0.4074	±	0.0964
  - mmlu_flan_cot_fewshot_sociology	0	get-answer	exact_match	0.8182	±	0.0842
  - mmlu_flan_cot_fewshot_us_foreign_policy	0	get-answer	exact_match	0.7273	±	0.1408
- mmlu_flan_cot_fewshot_stem	N/A	get-answer	exact_match	0.4866	±	0.0262
  - mmlu_flan_cot_fewshot_abstract_algebra	0	get-answer	exact_match	0.0909	±	0.0909
  - mmlu_flan_cot_fewshot_anatomy	0	get-answer	exact_match	0.4286	±	0.1373
  - mmlu_flan_cot_fewshot_astronomy	0	get-answer	exact_match	0.5625	±	0.1281
  - mmlu_flan_cot_fewshot_college_biology	0	get-answer	exact_match	0.5000	±	0.1291
  - mmlu_flan_cot_fewshot_college_chemistry	0	get-answer	exact_match	0.5000	±	0.1890
  - mmlu_flan_cot_fewshot_college_computer_science	0	get-answer	exact_match	0.2727	±	0.1408
  - mmlu_flan_cot_fewshot_college_mathematics	0	get-answer	exact_match	0.3636	±	0.1521
  - mmlu_flan_cot_fewshot_college_physics	0	get-answer	exact_match	0.3636	±	0.1521
  - mmlu_flan_cot_fewshot_computer_security	0	get-answer	exact_match	0.7273	±	0.1408
  - mmlu_flan_cot_fewshot_conceptual_physics	0	get-answer	exact_match	0.6538	±	0.0951
  - mmlu_flan_cot_fewshot_electrical_engineering	0	get-answer	exact_match	0.7500	±	0.1118
  - mmlu_flan_cot_fewshot_elementary_mathematics	0	get-answer	exact_match	0.7317	±	0.0701
  - mmlu_flan_cot_fewshot_high_school_biology	0	get-answer	exact_match	0.5938	±	0.0882
  - mmlu_flan_cot_fewshot_high_school_chemistry	0	get-answer	exact_match	0.3636	±	0.1050
  - mmlu_flan_cot_fewshot_high_school_computer_science	0	get-answer	exact_match	0.5556	±	0.1757
  - mmlu_flan_cot_fewshot_high_school_mathematics	0	get-answer	exact_match	0.3103	±	0.0874
  - mmlu_flan_cot_fewshot_high_school_physics	0	get-answer	exact_match	0.2353	±	0.1060
  - mmlu_flan_cot_fewshot_high_school_statistics	0	get-answer	exact_match	0.3043	±	0.0981
  - mmlu_flan_cot_fewshot_machine_learning	0	get-answer	exact_match	0.4545	±	0.1575

Group evaluation

Group	Version	Filter	Metric	Value		Stderr
mmlu_flan_cot_fewshot	N/A	get-answer	exact_match	0.5833	±	0.0118
- mmlu_flan_cot_fewshot_humanities	N/A	get-answer	exact_match	0.5039	±	0.0205
- mmlu_flan_cot_fewshot_other	N/A	get-answer	exact_match	0.6833	±	0.0244
- mmlu_flan_cot_fewshot_social_sciences	N/A	get-answer	exact_match	0.7003	±	0.0239
- mmlu_flan_cot_fewshot_stem	N/A	get-answer	exact_match	0.4866	±	0.0262
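
The ± column gives the standard error of each exact_match score. One common way to read it, assuming a normal approximation (which is rough for subtasks with few questions), is an approximate 95% interval of value ± 1.96 × stderr. The sketch below applies that to the group-level rows; approx_ci95 is a hypothetical helper name, and the numbers are copied from the group table.

```python
def approx_ci95(value: float, stderr: float) -> tuple:
    """Approximate 95% interval under a normal approximation, clipped to [0, 1]."""
    half_width = 1.96 * stderr
    return max(0.0, value - half_width), min(1.0, value + half_width)


# Group-level scores (value, stderr) copied from the group table.
groups = {
    "overall": (0.5833, 0.0118),
    "humanities": (0.5039, 0.0205),
    "other": (0.6833, 0.0244),
    "social_sciences": (0.7003, 0.0239),
    "stem": (0.4866, 0.0262),
}

for name, (value, stderr) in groups.items():
    low, high = approx_ci95(value, stderr)
    print(f"{name:16s} {value:.4f}  [{low:.4f}, {high:.4f}]")
```

Subtask rows with large standard errors (for example ±0.1890 for college_chemistry) yield very wide intervals under this reading, so differences between individual subtasks should be interpreted with caution.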