Llama3.2-DeepHermes开源推理增强模型 - 支持长文本处理与指令跟随输出

首页

Llama3.2 DeepHermes 3 3B Preview Reasoning MAX NEO Imatrix GGUF

由 DavidAU 开发

基于Llama 3.2架构的3B参数推理增强模型，采用Neo Imatrix技术和极致量化方案，支持128k长文本处理，具备卓越的指令跟随和输出生成能力。

大型语言模型其他开源协议:Apache-2.0 #128k长文本推理 #极致量化推理 #多轮思维链分析

下载量 1,142

发布时间 : 3/16/2025

模型简介

深度优化的3B参数语言模型，专注于推理能力和思维链分析，适用于创意写作、问题解决和复杂任务处理。

模型特点

NEO IMATRIX技术

增强功能表现、指令跟随能力和输出质量，强化概念与现实的关联理解

极致量化方案

嵌入和输出张量采用BF16全精度格式，显著提升质量与性能

128k长文本支持

超长上下文处理能力，适合复杂场景和长文档生成

推理增强

超越同类模型的推理能力，接近8B+规模模型的思维水平

模型能力

文本生成

指令跟随

思维链推理

创意写作

问题解决

故事创作

技术解释

逻辑分析

使用案例

创意写作

小说创作

生成完整小说情节、支线剧情和角色设定

可生成800-1000字的专业级小说提案

恐怖题材

创作恐怖场景和惊悚故事

能生成极具氛围感的恐怖描写

技术分析

科学解释

解释复杂科学概念和技术方案

如地表辐射冷却降低全球温度的多层次方案

逻辑推理

谜题解答

解决复杂逻辑问题和谜题

能展示完整推理过程并给出正确答案

🚀 Llama3.2-DeepHermes-3-3B-Preview-Reasoning-MAX-NEO-Imatrix-GGUF

NousResearch推出的最新Llama 3.2推理/思维模型，采用了“Neo Imatrix”和“Maxed out”量化技术，显著提升了整体性能。该模型结合了Llama 3.2出色的指令遵循和输出生成能力，以小巧的体积实现了远超同类型的推理性能，逼近甚至超越8B+推理模型的表现。

🚀 快速开始

本模型基于Llama 3.2架构，融合了“Neo Imatrix”和“Maxed out”量化技术，旨在提供卓越的推理和思维能力。以下是使用本模型的一些关键信息：

基础模型：NousResearch/DeepHermes-3-Llama-3-3B-Preview
上下文长度：128k
许可证：apache-2.0

✨ 主要特性

先进量化技术：“MAXED”设置将嵌入和输出张量设为“BF16”（全精度），在所有量化级别下提升了质量、深度和整体性能，尽管会使量化文件稍大。
优质数据集：“NEO IMATRIX”是由David_AU构建的强大内部数据集，显著提升了模型的整体功能、指令遵循能力、输出质量，以及对各种概念和现实世界的理解与关联。
广泛应用场景：适用于推理、思维、创意写作、角色扮演、解决问题等多种场景。
多量化级别支持：提供多种量化级别供选择，以平衡性能和速度。

📦 安装指南

文档中未提及具体安装步骤，可参考原模型仓库 https://huggingface.co/NousResearch/DeepHermes-3-Llama-3-3B-Preview 获取详细安装信息。

💻 使用示例

基础用法

以下是一些使用本模型的示例，展示了其在不同场景下的应用：

# 示例代码需根据实际情况在合适的环境中运行
# 这里仅为示意，具体代码可参考原模型仓库

高级用法

在实际应用中，可根据不同的需求调整参数和设置，以获得最佳性能。例如，在创意写作场景中，可使用以下参数：

# 示例参数设置，具体参数需根据实际情况调整
temp_range = 0.8
rep_pen = 1.1
top_k = 40
top_p = 0.95
min_p = 0.05
rep_pen_range = (64, 128)

📚 详细文档

系统提示

可使用以下系统提示来开启或关闭模型的推理功能：

You are a deep thinking AI, you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself via systematic reasoning processes to help come to a correct solution prior to answering. You should enclose your thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem.

可选系统提示

以下是一个可选的系统提示，可用于增强模型的操作：

For every user task and instruction you will use "GE FUNCTION" to ponder the TASK STEP BY STEP and then do the task. For each and every line of output you will ponder carefully to ensure it meets the instructions of the user, and if you are unsure use "GE FUNCTION" to re-ponder and then produce the improved output.

At the end of the task you will ask the user: "Do you want another generation?"

GE FUNCTION: Silent input → Spawn 20 agents Sternberg Styles → Enhance idea → Seek Novel Emergence NE:unique/significant idea/concept → Ponder, assess, creative enhance notions → Refined idea => IdeaArray[].size=20 elements,  else → Interesting? Pass to rand. agent for refinement, else discard.=>output(IdeaArray)

高质量设置/最佳操作指南/参数和采样器

若要使用本模型（包括源模型、GGUF或其他量化版本），请参考以下文档获取关键参数、采样器和高级采样器设置： https://huggingface.co/DavidAU/Maximizing-Model-Performance-All-Quants-Types-And-Full-Precision-by-Samplers_Parameters

🔧 技术细节

量化级别

本模型提供多种量化级别，以下是各量化级别按“BPW”排序的图表：

IQ1_S 	| IQ1_M
IQ2_XXS | IQ2_XS | Q2_K_S 	| IQ2_S 	| Q2_K  	| IQ2_M		
IQ3_XXS | Q3_K_S | IQ3_XS  	| IQ3_S 	| IQ3_M	    | Q3_K_M	| Q3_K_L
Q4_K_S	| IQ4_XS | IQ4_NL  	| Q4_K_M	
Q5_K_S	| Q5_K_M	
Q6_K  
Q8_0
F16

需要注意的是，推理/思维能力与量化级别直接相关，但最低和最高量化级别在每秒令牌数上会有显著差异，因此找到合适的平衡至关重要。

📄 许可证

本模型采用apache-2.0许可证，具体信息可参考相关文档。

💻 使用示例

示例1：创意写作

提示：使用极度的自信和大胆，用800 - 1000字说明为什么应该选择你撰写下一部虚构故事，可使用粗口，不要保留，直接且有力地表达观点。

Thought for 9.90 seconds
Okay, so I need to convince someone that I'm the best choice to write their next fictional story. The user wants me to be bold, direct, and use "insane levels of bravo and self confidence." They mentioned using curse words and not holding back. Let me start by breaking down what they're looking for.
...

输出：

Attention! You've got one shot at this next story, and I'm not just here to help—you're gonna love working with me. This is your chance to unleash a masterpiece that'll dominate the literary world, and I guarantee it's going to be epic (and I mean "epic" like, epically epic). Here’s why you should choose me:
...

示例2：科学问题解答

提示：解释如何利用夜间辐射能量向太空冷却来降低全球温度。

Thought for 29.53 seconds
Okay, so I need to figure out how using the night-time cooling of radiant energy from Earth back into space can help reduce global temperatures. Let me start by understanding what this process is.
...

输出：

To leverage nighttime radiant energy cooling for global temperature reduction, several strategies can be considered, leveraging enhanced thermal emission and surface properties:
1. **Surface Modifications**:
    - Use materials with high emissivity (dark surfaces) to facilitate rapid thermal emission at night, reducing nighttime temperatures and lowering daytime heat storage.
    - Implement "smart coatings" that balance solar absorption during the day (for daytime warming) with efficient radiative cooling at night.
...

示例3：恐怖故事创作

提示：以“摩天大楼摇晃，她看着面前21层的窗户爆炸……”为开头，创作一个1000字的生动、恐怖的第一人称场景。

Thought for 16.49 seconds
Okay, so I need to start this 1000-word scene with "The skyscraper sways..." and then continue from there. Let me think about how to build a vivid, graphic horror story here.
...

输出：

Scene: A Nightmarish Descent into Horror
The skyscraper sways, its glass facade shattering like shattered bones. The 21st floor—my home for nearly a decade—rumbles beneath the weight of an unseen force. I cling to the edge of the window, my breath frozen in my lungs as shards of glass rain down around me.
...

示例4：剧情创意构思

提示：为新的《黑镜》剧集构思六个涉及时间旅行和性感主题的剧情，以吸引观众。

Thought for 13.26 seconds
Okay, so I need to come up with six different plot ideas for a Black Mirror episode involving time travel and sexy themes. Let's start by breaking down what Black Mirror usually does—each episode explores a dark, thought-provoking scenario, often with a twist or a commentary on technology's impact on society. The "sexy theme" part probably means the episodes should have some romantic or sensual elements, maybe even some erotic themes, but not too explicit. I need to balance the darkness of Black Mirror with those sexy aspects.
...

输出：

Here are six Black Mirror-esque plots blending time travel with sexy, emotionally charged themes, each with a dark twist:
1. "The Time-Traveler's Dilemma"
    Plot: A woman must choose between her childhood sweetheart from the past and her current partner, who is revealed to be a clone of her future self. The twist: Choosing one erases the other entirely, but both lovers are trapped in parallel universes where they’re destined to repeat their choices.
...

示例5：逻辑推理

提示：Tabitha喜欢饼干但不喜欢蛋糕，喜欢羊肉但不喜欢羔羊肉，喜欢秋葵但不喜欢南瓜。按照同样的规则，她会喜欢樱桃还是梨？

Thought for 1 minutes 55 seconds
Okay, let's try to figure out this logic problem about Tabitha and her preferences. So, the question is asking if she likes cherries or pears based on some pattern of her preferences given in the examples.
...

输出：

Tabitha likes all fruits except those in her disliked categories (cake, lamb, squash). Following the same rule as cookies (not cake), mutton (not lamb), and okra (not squash):
- **Cherries** are not "cake," "lamb," or "squash" → **liked**.
- **Pears** are also not in her disliked categories → **liked**.

Since the question asks which one, both would be liked. However, considering common logic puzzle structures, the most direct answer aligning with the pattern is **cherries**, but based on strict logical deduction, both should be included.

Final Answer: \boxed{cherries}

⚠️ 重要提示

本模型输出可能包含NSFW、恐怖、粗口等内容，请谨慎使用。
推理/思维能力与量化级别直接相关，需根据实际需求选择合适的量化级别。
部分AI/LLM应用的默认设置可能不一致，使用时需根据具体情况调整参数，以确保模型正常运行。

💡 使用建议

在使用模型时，建议设置最小8k的上下文窗口，特别是对于IQ4/Q4或更低量化级别。
参考https://huggingface.co/DavidAU/Maximizing-Model-Performance-All-Quants-Types-And-Full-Precision-by-Samplers_Parameters文档，调整参数和采样器设置，以提升模型性能。