Superthoughts-lite-v2-MOE-Llama3.2-GGUF open-source model

Superthoughts Lite V2 MOE Llama3.2 GGUF

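Developed by Pinkstack

Superthoughts Lite v2 is a lightweight mixture-of-experts (MoE) model based on the Llama-3.2 architecture. It focuses on reasoning tasks and aims for improved accuracy and performance.

Large language model · Multilingual support · #Mixture-of-experts reasoning #Structured reasoning generation #Precise multi-domain responses

Downloads: 119

Release date: 5/6/2025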

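Model overview

This is a lightweight reasoning model suited to chat, math, code, and scientific reasoning tasks. Its mixture-of-experts (MoE) architecture enables efficient inference and reduces looping (repetitive output) during response generation.

Model features

Mixture-of-experts architecture

Contains four experts (chat, math, code, and scientific reasoning). Two experts are active at a time during inference, which improves task-specific performance.

Efficient inference

Trained with GRPO and fine-tuned with Unsloth, which improves performance and reduces looping.

Structured thinking output

Can generate its step-by-step reasoning inside <think> tags, improving transparency and interpretability.

Long-context support

Supports a context length of 131,072 tokens, which suits complex tasks.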
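The source does not describe how the router chooses which two of the four experts to activate. As a minimal sketch, assuming a conventional softmax gate with top-2 selection, the choice could look like the following. The function name `top2_route`, the ordering of the experts, and the example scores are all hypothetical.

```python
import math

# Expert names follow the model card; their order here is an assumption.
EXPERTS = ["chat", "math", "code", "science"]


def top2_route(gate_logits):
    """Pick the two highest-scoring experts and renormalize their weights.

    gate_logits: one raw router score per expert (hypothetical values).
    Returns a list of (expert_name, weight) pairs whose weights sum to 1.
    """
    if len(gate_logits) != len(EXPERTS):
        raise ValueError("expected one logit per expert")
    # Numerically stable softmax over all experts.
    peak = max(gate_logits)
    exps = [math.exp(x - peak) for x in gate_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the two most probable experts, then renormalize their weights.
    top2 = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:2]
    kept = sum(probs[i] for i in top2)
    return [(EXPERTS[i], probs[i] / kept) for i in top2]


# Example with made-up scores that favor the math and code experts.
routes = top2_route([0.1, 2.0, 1.5, -0.3])
```

In a real MoE layer this selection typically happens per token inside each MoE block, and the selected experts' outputs are combined using the renormalized weights.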

Model capabilities

Text generation

Mathematical reasoning

Code generation

Scientific reasoning

Dialogue systems

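Use cases

Education

Solving math problems

Helps students solve complex math problems and shows the step-by-step reasoning.

Improves learning efficiency and depth of understanding.

Programming study aid

Explains programming concepts and generates example code.

Helps beginners pick up programming skills faster.

Research

Explaining scientific concepts

Explains complex scientific concepts and theories.

Helps researchers quickly grasp knowledge from other fields.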

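🚀 Superthoughts Lite v2 MOE Llama3.2 model

Superthoughts Lite v2 MOE Llama3.2 is a text generation model based on the Llama-3.2 architecture and built as a mixture of experts (MoE). It is designed to give accurate, efficient output across chat, math, code, and science tasks.

🚀 Quick start

For best results, give the model a system prompt in the following format: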

Thinking: enabled.

Follow this format strictly:
<think>
Write your step-by-step reasoning here.
Break down the problem into smaller parts.
Solve each part systematically.
Check your work and verify the answer makes sense.
</think>
[Your final answer after thinking].
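The prompt above asks the model to put its reasoning inside `<think>` tags and the final answer after them. A minimal sketch of how a client might package that prompt and separate the two parts of a reply is shown below. The `role`/`content` message layout is a common chat convention rather than something the source specifies, and the helper names and the sample reply are hypothetical.

```python
import re

# System prompt copied from the quick-start section.
SYSTEM_PROMPT = """Thinking: enabled.

Follow this format strictly:
<think>
Write your step-by-step reasoning here.
Break down the problem into smaller parts.
Solve each part systematically.
Check your work and verify the answer makes sense.
</think>
[Your final answer after thinking]."""


def build_messages(user_text):
    """Return a chat message list with the system prompt first."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_text},
    ]


def split_thinking(reply):
    """Split a reply into (reasoning, answer).

    If no complete <think>...</think> block is present, the reasoning is
    an empty string and the whole reply is treated as the answer.
    """
    match = re.search(r"<think>(.*?)</think>", reply, flags=re.DOTALL)
    if match is None:
        return "", reply.strip()
    return match.group(1).strip(), reply[match.end():].strip()


# Made-up reply used only to illustrate the parsing step.
sample = "<think>\n2 + 2 is 4.\n</think>\nThe answer is 4."
reasoning, answer = split_thinking(sample)
```

Keeping the reasoning separate makes it easy to show only the final answer to end users while still logging the reasoning for inspection.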

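✨ Key features

Multi-domain: suited to chemistry, code, math, conversation, and other domains, giving it a wide range of applications.
Cooperating experts: the model contains four experts, responsible for chat reasoning, math reasoning, code reasoning, and scientific reasoning, which helps it give more accurate results on different tasks.
High-capacity generation: generates up to 16,380 tokens with a context size of 131,072, so it can handle complex text generation tasks.
Improved performance: compared with its predecessor, Pinkstack/Superthoughts-lite-v1, it shows clear gains in code generation and text performance.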
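The two limits listed above interact: tokens reserved for the reply are not available for the prompt. A minimal sketch of that arithmetic follows, assuming (as is usual for decoder-only models, though the source does not say so explicitly) that prompt and generated tokens share the same context window. The helper names are hypothetical.

```python
CONTEXT_WINDOW = 131_072  # context size stated in the model card
MAX_NEW_TOKENS = 16_380   # maximum generation length stated in the model card


def prompt_budget(max_new_tokens=MAX_NEW_TOKENS):
    """Tokens left for the prompt after reserving room for the reply."""
    if not 0 < max_new_tokens <= MAX_NEW_TOKENS:
        raise ValueError("max_new_tokens must be between 1 and 16,380")
    return CONTEXT_WINDOW - max_new_tokens


def fits(prompt_tokens, max_new_tokens=MAX_NEW_TOKENS):
    """True if a prompt of this many tokens leaves room for the full reply."""
    return 0 <= prompt_tokens <= prompt_budget(max_new_tokens)


budget = prompt_budget()
```

Reserving the full 16,380 tokens leaves 114,692 tokens for the system prompt, conversation history, and user input combined.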

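📚 Detailed documentation

Model information

Property	Details
Model type	Text generation model
Base model and training	Based on meta-llama/Llama-3.2-1B-Instruct, trained with GRPO and SFT
Parameters	3.91B parameters; 4 experts in total, 2 active at a time
Generation	Up to 16,380 generated tokens; context size 131,072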

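Model training

Training proceeds in two stages:

Base model training: a single base model for all experts is created first by fine-tuning meta-llama/Llama-3.2-1B-Instruct with GRPO.
Expert training: each candidate expert is trained with SFT, then optimized again with GRPO once SFT is complete.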
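Both stages rely on GRPO, which scores each sampled completion relative to the other completions drawn for the same prompt. The source gives no reward functions, group sizes, or hyperparameters, so the following is only a sketch of the group-relative advantage step under assumed values. The function name and the example rewards are hypothetical, and using the population standard deviation is one possible implementation choice.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of completions for one prompt.

    Each advantage is (reward - group mean) / (group std + eps), so
    completions that beat the group average get a positive advantage.
    """
    if len(rewards) < 2:
        raise ValueError("a group needs at least two completions")
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population standard deviation
    return [(r - mean) / (std + eps) for r in rewards]


# Made-up rewards: 1.0 for a correct final answer, 0.0 otherwise.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Because the baseline is the group mean, this step needs no separate value model, which keeps the approach practical for small models such as a 1B base.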

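🔧 Technical details

The mixture-of-experts (MoE) design routes different reasoning tasks to different experts, which improves performance on each kind of task. Training used GRPO and SFT so that the model learns accurate reasoning processes.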

📄 License

Use of this model is subject to the LLAMA 3.2 COMMUNITY LICENSE.

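⚠️ Important notes

Limited safety alignment: the model has had some safety alignment, but only to a very limited degree, and it may sometimes produce uncensored content.
Possible hallucinations: all large language models, this one included, can hallucinate and output false information, so check results carefully.
Information accuracy: the model may make up information based on its own understanding, so provide accurate information in your prompts.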

GGUF template

{{ if .Messages }}
{{- if or .System .Tools }}<|start_header_id|>system<|end_header_id|>
{{- if .System }}

{{ .System }}
{{- end }}
{{- if .Tools }}

You are a helpful assistant with tool calling capabilities. When you receive a tool call response, use the output to format an answer to the original use question.
{{- end }}
{{- end }}<|eot_id|>
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 }}
{{- if eq .Role "user" }}<|start_header_id|>user<|end_header_id|>
{{- if and $.Tools $last }}

Given the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.

Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}. Do not use variables.

{{ $.Tools }}
{{- end }}

{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>

{{ end }}
{{- else if eq .Role "assistant" }}<|start_header_id|>assistant<|end_header_id|>
{{- if .ToolCalls }}

{{- range .ToolCalls }}{"name": "{{ .Function.Name }}", "parameters": {{ .Function.Arguments }}}{{ end }}
{{- else }}

{{ .Content }}{{ if not $last }}<|eot_id|>{{ end }}
{{- end }}
{{- else if eq .Role "tool" }}<|start_header_id|>ipython<|end_header_id|>

{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>

{{ end }}
{{- end }}
{{- end }}
{{- else }}
{{- if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ end }}{{ .Response }}{{ if .Response }}<|eot_id|>{{ end }}
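To make the template easier to follow, here is a minimal sketch that reproduces only its final branch (the one used when `.Messages` is empty) for a single user turn. It reflects one reading of the template text above, assuming the blank lines after each header are literal and that no other whitespace is significant. The function name `render_single_turn` is hypothetical, and multi-turn messages and tool calls are deliberately left out.

```python
def render_single_turn(prompt, system=""):
    """Render the prompt string for one user turn, ready for generation.

    Mirrors the single-prompt branch of the template: optional system
    block, optional user block, then an open assistant header.
    """
    parts = []
    if system:
        parts.append(
            "<|start_header_id|>system<|end_header_id|>\n\n"
            f"{system}<|eot_id|>"
        )
    if prompt:
        parts.append(
            "<|start_header_id|>user<|end_header_id|>\n\n"
            f"{prompt}<|eot_id|>"
        )
    # The assistant header is left open so the model writes the reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


text = render_single_turn("What is 2 + 2?", system="Thinking: enabled.")
```

Runtimes that consume the GGUF template normally apply it for you; a hand-rolled renderer like this is mainly useful for checking what the model actually sees.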
