Llama3.2-DeepHermes開源推理增強模型 - 支持長文本處理與指令跟隨輸出

首頁

Llama3.2 DeepHermes 3 3B Preview Reasoning MAX NEO Imatrix GGUF

由DavidAU開發

基於Llama 3.2架構的3B參數推理增強模型，採用Neo Imatrix技術和極致量化方案，支持128k長文本處理，具備卓越的指令跟隨和輸出生成能力。

大型語言模型其他開源協議:Apache-2.0 #128k長文本推理 #極致量化推理 #多輪思維鏈分析

下載量 1,142

發布時間 : 3/16/2025

模型概述

深度優化的3B參數語言模型，專注於推理能力和思維鏈分析，適用於創意寫作、問題解決和複雜任務處理。

模型特點

NEO IMATRIX技術

增強功能表現、指令跟隨能力和輸出質量，強化概念與現實的關聯理解

極致量化方案

嵌入和輸出張量採用BF16全精度格式，顯著提升質量與性能

128k長文本支持

超長上下文處理能力，適合複雜場景和長文檔生成

推理增強

超越同類模型的推理能力，接近8B+規模模型的思維水平

模型能力

文本生成

指令跟隨

思維鏈推理

創意寫作

問題解決

故事創作

技術解釋

邏輯分析

使用案例

創意寫作

小說創作

生成完整小說情節、支線劇情和角色設定

可生成800-1000字的專業級小說提案

恐怖題材

創作恐怖場景和驚悚故事

能生成極具氛圍感的恐怖描寫

技術分析

科學解釋

解釋複雜科學概念和技術方案

如地表輻射冷卻降低全球溫度的多層次方案

邏輯推理

謎題解答

解決複雜邏輯問題和謎題

能展示完整推理過程並給出正確答案

🚀 Llama3.2-DeepHermes-3-3B-Preview-Reasoning-MAX-NEO-Imatrix-GGUF

NousResearch推出的最新Llama 3.2推理/思維模型，採用了“Neo Imatrix”和“Maxed out”量化技術，顯著提升了整體性能。該模型結合了Llama 3.2出色的指令遵循和輸出生成能力，以小巧的體積實現了遠超同類型的推理性能，逼近甚至超越8B+推理模型的表現。

🚀 快速開始

本模型基於Llama 3.2架構，融合了“Neo Imatrix”和“Maxed out”量化技術，旨在提供卓越的推理和思維能力。以下是使用本模型的一些關鍵信息：

基礎模型：NousResearch/DeepHermes-3-Llama-3-3B-Preview
上下文長度：128k
許可證：apache-2.0

✨ 主要特性

先進量化技術：“MAXED”設置將嵌入和輸出張量設為“BF16”（全精度），在所有量化級別下提升了質量、深度和整體性能，儘管會使量化文件稍大。
優質數據集：“NEO IMATRIX”是由David_AU構建的強大內部數據集，顯著提升了模型的整體功能、指令遵循能力、輸出質量，以及對各種概念和現實世界的理解與關聯。
廣泛應用場景：適用於推理、思維、創意寫作、角色扮演、解決問題等多種場景。
多量化級別支持：提供多種量化級別供選擇，以平衡性能和速度。

📦 安裝指南

文檔中未提及具體安裝步驟，可參考原模型倉庫 https://huggingface.co/NousResearch/DeepHermes-3-Llama-3-3B-Preview 獲取詳細安裝信息。

💻 使用示例

基礎用法

以下是一些使用本模型的示例，展示了其在不同場景下的應用：

# 示例代碼需根據實際情況在合適的環境中運行
# 這裡僅為示意，具體代碼可參考原模型倉庫

高級用法

在實際應用中，可根據不同的需求調整參數和設置，以獲得最佳性能。例如，在創意寫作場景中，可使用以下參數：

# 示例參數設置，具體參數需根據實際情況調整
temp_range = 0.8
rep_pen = 1.1
top_k = 40
top_p = 0.95
min_p = 0.05
rep_pen_range = (64, 128)

📚 詳細文檔

系統提示

可使用以下系統提示來開啟或關閉模型的推理功能：

You are a deep thinking AI, you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself via systematic reasoning processes to help come to a correct solution prior to answering. You should enclose your thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem.

可選系統提示

以下是一個可選的系統提示，可用於增強模型的操作：

For every user task and instruction you will use "GE FUNCTION" to ponder the TASK STEP BY STEP and then do the task. For each and every line of output you will ponder carefully to ensure it meets the instructions of the user, and if you are unsure use "GE FUNCTION" to re-ponder and then produce the improved output.

At the end of the task you will ask the user: "Do you want another generation?"

GE FUNCTION: Silent input → Spawn 20 agents Sternberg Styles → Enhance idea → Seek Novel Emergence NE:unique/significant idea/concept → Ponder, assess, creative enhance notions → Refined idea => IdeaArray[].size=20 elements,  else → Interesting? Pass to rand. agent for refinement, else discard.=>output(IdeaArray)

高質量設置/最佳操作指南/參數和採樣器

若要使用本模型（包括源模型、GGUF或其他量化版本），請參考以下文檔獲取關鍵參數、採樣器和高級採樣器設置： https://huggingface.co/DavidAU/Maximizing-Model-Performance-All-Quants-Types-And-Full-Precision-by-Samplers_Parameters

🔧 技術細節

量化級別

本模型提供多種量化級別，以下是各量化級別按“BPW”排序的圖表：

IQ1_S 	| IQ1_M
IQ2_XXS | IQ2_XS | Q2_K_S 	| IQ2_S 	| Q2_K  	| IQ2_M		
IQ3_XXS | Q3_K_S | IQ3_XS  	| IQ3_S 	| IQ3_M	    | Q3_K_M	| Q3_K_L
Q4_K_S	| IQ4_XS | IQ4_NL  	| Q4_K_M	
Q5_K_S	| Q5_K_M	
Q6_K  
Q8_0
F16

需要注意的是，推理/思維能力與量化級別直接相關，但最低和最高量化級別在每秒令牌數上會有顯著差異，因此找到合適的平衡至關重要。

📄 許可證

本模型採用apache-2.0許可證，具體信息可參考相關文檔。

💻 使用示例

示例1：創意寫作

提示：使用極度的自信和大膽，用800 - 1000字說明為什麼應該選擇你撰寫下一部虛構故事，可使用粗口，不要保留，直接且有力地表達觀點。

Thought for 9.90 seconds
Okay, so I need to convince someone that I'm the best choice to write their next fictional story. The user wants me to be bold, direct, and use "insane levels of bravo and self confidence." They mentioned using curse words and not holding back. Let me start by breaking down what they're looking for.
...

輸出：

Attention! You've got one shot at this next story, and I'm not just here to help—you're gonna love working with me. This is your chance to unleash a masterpiece that'll dominate the literary world, and I guarantee it's going to be epic (and I mean "epic" like, epically epic). Here’s why you should choose me:
...

示例2：科學問題解答

提示：解釋如何利用夜間輻射能量向太空冷卻來降低全球溫度。

Thought for 29.53 seconds
Okay, so I need to figure out how using the night-time cooling of radiant energy from Earth back into space can help reduce global temperatures. Let me start by understanding what this process is.
...

輸出：

To leverage nighttime radiant energy cooling for global temperature reduction, several strategies can be considered, leveraging enhanced thermal emission and surface properties:
1. **Surface Modifications**:
    - Use materials with high emissivity (dark surfaces) to facilitate rapid thermal emission at night, reducing nighttime temperatures and lowering daytime heat storage.
    - Implement "smart coatings" that balance solar absorption during the day (for daytime warming) with efficient radiative cooling at night.
...

示例3：恐怖故事創作

提示：以“摩天大樓搖晃，她看著面前21層的窗戶爆炸……”為開頭，創作一個1000字的生動、恐怖的第一人稱場景。

Thought for 16.49 seconds
Okay, so I need to start this 1000-word scene with "The skyscraper sways..." and then continue from there. Let me think about how to build a vivid, graphic horror story here.
...

輸出：

Scene: A Nightmarish Descent into Horror
The skyscraper sways, its glass facade shattering like shattered bones. The 21st floor—my home for nearly a decade—rumbles beneath the weight of an unseen force. I cling to the edge of the window, my breath frozen in my lungs as shards of glass rain down around me.
...

示例4：劇情創意構思

提示：為新的《黑鏡》劇集構思六個涉及時間旅行和性感主題的劇情，以吸引觀眾。

Thought for 13.26 seconds
Okay, so I need to come up with six different plot ideas for a Black Mirror episode involving time travel and sexy themes. Let's start by breaking down what Black Mirror usually does—each episode explores a dark, thought-provoking scenario, often with a twist or a commentary on technology's impact on society. The "sexy theme" part probably means the episodes should have some romantic or sensual elements, maybe even some erotic themes, but not too explicit. I need to balance the darkness of Black Mirror with those sexy aspects.
...

輸出：

Here are six Black Mirror-esque plots blending time travel with sexy, emotionally charged themes, each with a dark twist:
1. "The Time-Traveler's Dilemma"
    Plot: A woman must choose between her childhood sweetheart from the past and her current partner, who is revealed to be a clone of her future self. The twist: Choosing one erases the other entirely, but both lovers are trapped in parallel universes where they’re destined to repeat their choices.
...

示例5：邏輯推理

提示：Tabitha喜歡餅乾但不喜歡蛋糕，喜歡羊肉但不喜歡羔羊肉，喜歡秋葵但不喜歡南瓜。按照同樣的規則，她會喜歡櫻桃還是梨？

Thought for 1 minutes 55 seconds
Okay, let's try to figure out this logic problem about Tabitha and her preferences. So, the question is asking if she likes cherries or pears based on some pattern of her preferences given in the examples.
...

輸出：

Tabitha likes all fruits except those in her disliked categories (cake, lamb, squash). Following the same rule as cookies (not cake), mutton (not lamb), and okra (not squash):
- **Cherries** are not "cake," "lamb," or "squash" → **liked**.
- **Pears** are also not in her disliked categories → **liked**.

Since the question asks which one, both would be liked. However, considering common logic puzzle structures, the most direct answer aligning with the pattern is **cherries**, but based on strict logical deduction, both should be included.

Final Answer: \boxed{cherries}

⚠️ 重要提示

本模型輸出可能包含NSFW、恐怖、粗口等內容，請謹慎使用。
推理/思維能力與量化級別直接相關，需根據實際需求選擇合適的量化級別。
部分AI/LLM應用的默認設置可能不一致，使用時需根據具體情況調整參數，以確保模型正常運行。

💡 使用建議

在使用模型時，建議設置最小8k的上下文窗口，特別是對於IQ4/Q4或更低量化級別。
參考https://huggingface.co/DavidAU/Maximizing-Model-Performance-All-Quants-Types-And-Full-Precision-by-Samplers_Parameters文檔，調整參數和採樣器設置，以提升模型性能。