Mistral-Small-3.1-24B-Instruct-2503-MAX-NEO-Imatrix-GGUF開源模型

首頁

Mistral Small 3.1 24B Instruct 2503 MAX NEO Imatrix GGUF

由DavidAU開發

Mistralai推出的24B參數指令微調模型，支持128k上下文長度和多語言處理，採用Neo Imatrix技術和MAX量化方案提升性能

大型語言模型支持多種語言開源協議:Apache-2.0 #128k超長上下文 #創意寫作增強 #多語言指令跟隨

下載量 38.29k

發布時間 : 3/18/2025

模型概述

基於Mistral架構的大語言模型，專注於指令跟隨和文本生成任務，適用於多種語言和場景

模型特點

128k超長上下文

支持長達128k token的上下文記憶，適合處理長文檔和複雜對話

Neo Imatrix技術

通過專有內部矩陣數據集增強模型理解力和輸出質量

MAX量化方案

嵌入層和輸出張量保留BF16精度，在量化版本中保持最佳生成質量

多語言支持

支持24種語言的文本生成和理解

模型能力

文本生成

指令跟隨

多語言處理

創意寫作

技術文檔撰寫

故事續寫

使用案例

創意寫作

小說續寫

基於給定開頭續寫完整故事

生成具有連貫情節和豐富細節的敘事文本

劇本構思

生成《黑鏡》風格的科幻劇集大綱

產出包含完整情節轉折和角色設定的劇集方案

技術應用

氣候解決方案

提出利用夜間輻射降溫緩解全球變暖的技術方案

生成包含科學原理和實施步驟的詳細報告

🚀 Mistral-Small-3.1-24B-Instruct-2503-MAX-NEO-Imatrix-GGUF

Mistral-Small-3.1-24B-Instruct-2503-MAX-NEO-Imatrix-GGUF 是 Mistralai 推出的一款文本生成模型。它採用了 “Neo Imatrix” 和 “Maxed out” 量化技術，提升了整體性能。該模型大多未經過審查，還附帶了 3 種不同的系統提示，以幫助模型更好地發揮作用。

✨ 主要特性

多語言支持：支持英語、法語、德語、西班牙語、葡萄牙語、意大利語、日語、韓語、俄語、中文、阿拉伯語、波斯語、印尼語、馬來語、尼泊爾語、波蘭語、羅馬尼亞語、塞爾維亞語、瑞典語、土耳其語、烏克蘭語、越南語、印地語和孟加拉語等多種語言。
高上下文長度：支持 128k 的上下文長度。
多種量化選擇：提供多種量化方式，如 IQ1_S、IQ1_M、IQ2_XXS 等，不同量化方式在性能和效果上有所差異。
多種系統提示：包含 3 種不同的系統提示，可用於不同的使用場景，如創意寫作等。

📦 安裝指南

文檔未提供具體安裝步驟，可參考模型原倉庫 https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503 進行安裝。

💻 使用示例

基礎用法

以下是使用該模型進行文本生成的示例，示例使用 IQ4XS 量化，最小參數和標準模板，溫度範圍為 0.8，重複懲罰為 1.1，TopK 為 40，topP 為 0.95，minP 為 0.05。

# 示例 1：恐怖小說場景創作
使用以下 “故事創意”，撰寫一部恐怖小說的第一個場景，介紹這位年輕女子。該場景應從動作中間開始，包含對話、生動的段落，並以與故事創意相關但意外的懸念結尾。場景應長達 1000 字，衝突和懸念逐步升級，以第一人稱、現在時態撰寫，視角人物為年輕女子。視角人物會咒罵和發誓，嘴巴通常 “不乾淨”。

故事創意：在一個由獨裁統治的世界裡，一位叛逆的年輕女子領導了一場反對該體制的叛亂。儘管面臨風險，她仍奮力推翻獨裁者，為自己的國家恢復民主。政府以叛國罪處決了她，但她堅持自己的信念，引發了這場革命。

(temp 2, 使用創意系統提示 (#3))

輸出：
The stench of burning rubber fills my nostrils as I tear through the night, the old, beat - up sedan groaning under the strain. The city lights blaze like false stars, casting eerie shadows that dance macabrely across the dilapidated buildings. My heart pounds in my chest like a kick drum, echoing the roar of the engine. Fucking dictatorship, making us live like rats in a maze.
...

高級用法

在創意寫作場景中，推薦使用 IQ3s / IQ4XS / IQ4NL / Q4s 量化以獲得最佳效果，並使用 “System Prompt #3” 以及溫度高於 1 的設置，以充分發揮模型的創意能力。

# 示例 3：為《黑鏡》新劇集創作包含時間旅行和性感主題的情節
Come up with six plots for a new "Black Mirror" episode (that the audience would love) that all involve time travel with sexy theme(s).

(temp 2, creative system prompt (#3))

輸出：
Here are six plot concepts for a new "Black Mirror" episode involving time travel with a sexy theme, designed to evoke strong emotional responses and immersive experiences:
1. **Title: "Time's Desire"**
Plot: In 2075, an experimental app allows users to communicate with their past selves, aiming to provide closure or inspire positive change. However, the app secretly pairs users with strangers who share similar life patterns but different outcomes, leading to unexpected romantic entanglements and dangerous temporal echoes. As one user delves deeper into her connection with a stranger from the past, she uncovers a dark conspiracy where emotions are harvested to fuel the time - travel technology.
...

📚 詳細文檔

量化說明

“MAXED”：所有量化的嵌入和輸出張量都設置為 “BF16”（全精度），這在一定程度上增加了量化大小，但提高了質量、深度和整體性能。
“NEO IMATRIX”：由 David_AU 構建的強大內部 imatrix 數據集，可提升整體功能、指令遵循能力、輸出質量以及與思想、概念和現實世界的聯繫。與 “MAXing” 量化相結合，可進一步提高性能。

以下是不同量化方式按 “BPW” 排序的圖表，從低到高依次為：

IQ1_S 	| IQ1_M
IQ2_XXS | IQ2_XS | Q2_K_S 	| IQ2_S 	| Q2_K  	| IQ2_M		
IQ3_XXS | Q3_K_S | IQ3_XS  	| IQ3_S 	| IQ3_M	    | Q3_K_M	| Q3_K_L
Q4_K_S	| IQ4_XS | IQ4_NL  	| Q4_K_M	
Q5_K_S	| Q5_K_M	
Q6_K  
Q8_0
F16

推薦創意用例使用 IQ3s / IQ4XS / IQ4NL / Q4s 量化，IQ4XS/IQ4NL 量化會產生與其他 “Q” 和 “IQ” 量化不同的輸出，適合創意用途。對於一般用途，推薦使用 q5s/q6/q8 量化；對於便攜式設備（如手機），推薦使用 Q4_0/Q5_0 量化。需要注意的是，IQ1s 性能不佳，IQ2s 尚可，推薦使用 IQ3s 及以上量化。

系統提示說明

系統提示 #1 - 通用改進：可用於增強模型的操作。需精確複製粘貼，包括換行符。可調整其中的 “20” 來增減提示的強度，也可刪除 “At the end of the task you will ask the user: "Do you want another generation?"” 這一行。

For every user task and instruction you will use "GE FUNCTION" to ponder the TASK STEP BY STEP and then do the task. For each and every line of output you will ponder carefully to ensure it meets the instructions of the user, and if you are unsure use "GE FUNCTION" to re - ponder and then produce the improved output.

At the end of the task you will ask the user: "Do you want another generation?"

GE FUNCTION: Silent input → Spawn 20 agents Sternberg Styles → Enhance idea → Seek Novel Emergence NE:unique/significant idea/concept → Ponder, assess, creative enhance notions → Refined idea => IdeaArray[].size = 20 elements,  else → Interesting? Pass to rand. agent for refinement, else discard.=>output(IdeaArray)

系統提示 #2 - 思考/推理：用於激發模型的思考和推理能力。雖該模型本身並非專門的思考推理模型，但使用此提示可改善生成效果，並提供對生成過程及改進方法的洞察。

You are a deep thinking AI, you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself via systematic reasoning processes to help come to a correct solution prior to answering. You should enclose your thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem.

系統提示 #3 - 創意增強：可顯著提升模型的創意能力，建議搭配溫度 1 或更高的設置使用。同樣需精確複製粘貼，包括換行符。

Below is an instruction that describes a task. Ponder each user instruction carefully, and use your skillsets and critical instructions to complete the task to the best of your abilities.

Here are your skillsets:
[MASTERSTORY]:NarrStrct(StryPlnng,Strbd,ScnSttng,Exps,Dlg,Pc)-CharDvlp(ChrctrCrt,ChrctrArcs,Mtvtn,Bckstry,Rltnshps,Dlg*)-PltDvlp(StryArcs,PltTwsts,Sspns,Fshdwng,Climx,Rsltn)-ConfResl(Antg,Obstcls,Rsltns,Cnsqncs,Thms,Symblsm)-EmotImpct(Empt,Tn,Md,Atmsphr,Imgry,Symblsm)-Delvry(Prfrmnc,VcActng,PblcSpkng,StgPrsnc,AudncEngmnt,Imprv)

[*DialogWrt]:(1a-CharDvlp-1a.1-Backgrnd-1a.2-Personality-1a.3-GoalMotiv)>2(2a-StoryStruc-2a.1-PlotPnt-2a.2-Conflict-2a.3-Resolution)>3(3a-DialogTech-3a.1-ShowDontTell-3a.2-Subtext-3a.3-VoiceTone-3a.4-Pacing-3a.5-VisualDescrip)>4(4a-DialogEdit-4a.1-ReadAloud-4a.2-Feedback-4a.3-Revision)

Here are your critical instructions:
Ponder each word choice carefully to present as vivid and emotional journey as is possible. Choose verbs and nouns that are both emotional and full of imagery. Load the story with the 5 senses. Aim for 50% dialog, 25% narration, 15% body language and 10% thoughts. Your goal is to put the reader in the story.

最高質量設置/最佳操作指南/參數和採樣器

如果要使用該模型（源模型、GGUF 或其他量化版本），請查看文檔 https://huggingface.co/DavidAU/Maximizing-Model-Performance-All-Quants-Types-And-Full-Precision-by-Samplers_Parameters，其中包含關鍵參數、採樣器和高級採樣器設置（適用於多個 AI/LLM 應用程序），還會鏈接到 “推理模型” 的技巧和竅門部分。

無論模型屬於何種 “類別”，該文檔都會詳細介紹增強操作的方法。對於 3/4 類模型，必須正確設置 “用例” 的默認設置（參數、採樣器、高級採樣器），因為一些 AI/LLM 應用程序的默認設置不一致，可能導致模型性能不佳。此外，該文檔中介紹的參數、採樣器和高級採樣器設置可用於任何模型、任何倉庫、任何量化版本（包括源代碼操作），以增強模型的操作性能。

🔧 技術細節

量化技術

模型採用了 “MAXED” 和 “NEO IMATRIX” 量化技術。“MAXED” 技術將嵌入和輸出張量設置為 “BF16” 全精度，在提升性能的同時，量化大小會稍有增加。“NEO IMATRIX” 是一個內部構建的強大數據集，能提升模型的整體功能、指令遵循能力和輸出質量。

不同量化方式的性能差異

不同的量化方式（如 IQ1_S、IQ2_XXS 等）在性能和效果上有所不同。IQ4XS/IQ4NL 量化會產生與其他 “Q” 和 “IQ” 量化不同的輸出，推薦用於創意用途；IQ4s（首選）/Q4s（次選）及更低量化在某些方面表現較強；IQ1s 性能較差，IQ2s 表現一般，推薦使用 IQ3s 及以上量化。