DeepSeek-R1-0528-Qwen3-8B-GGUF開源模型 - 免費部署，增強推理能力，數學編程不在話下！

首頁

Deepseek R1 0528 Qwen3 8B GGUF

由Sci-fi-vy開發

DeepSeek-R1-0528是DeepSeek R1系列的小版本升級模型，通過增加計算資源和算法優化顯著提升了推理深度和能力，在數學、編程等多個基準測試中表現出色。

大型語言模型

Transformers

開源協議:MIT #深度推理優化 #數學編程增強 #低幻覺率

下載量 1,202

發布時間 : 6/3/2025

模型概述

DeepSeek-R1-0528是一個大型語言模型，專注於提升推理能力和處理複雜任務的能力，適用於數學、編程和通用邏輯任務。

模型特點

推理能力顯著提升

通過增加計算資源和算法優化，顯著提升了推理深度和能力，在數學、編程等多個基準測試中表現出色。

處理複雜推理任務能力增強

與上一版本相比，在處理複雜推理任務方面有顯著改進，例如在AIME 2025測試中準確率從70%提高到87.5%。

幻覺率降低

該版本降低了幻覺率，提高了回答的準確性。

函數調用支持增強

提供了更好的函數調用支持，增強了模型的實用性。

編碼體驗優化

帶來了更好的氛圍編碼體驗，提升了編程輔助的效果。

模型能力

文本生成

複雜推理

數學問題求解

編程輔助

函數調用

文件上傳處理

網頁搜索整合

使用案例

教育

數學問題求解

解決複雜的數學問題，如AIME和HMMT競賽題。

在AIME 2025測試中準確率達到87.5%。

編程

編程輔助

幫助開發者編寫和調試代碼。

在LiveCodeBench測試中Pass@1達到73.3%。

通用問答

複雜推理問答

回答需要深度推理的複雜問題。

在GPQA-Diamond測試中Pass@1達到81.0%。

🚀 DeepSeek-R1-0528模型卡片

DeepSeek-R1-0528模型是DeepSeek R1系列的一次小版本升級，通過增加計算資源和算法優化，顯著提升了推理深度和能力，在數學、編程等多個基準測試中表現出色，整體性能接近領先模型。

🚀 快速開始

你可以在DeepSeek官方網站與DeepSeek-R1進行對話：chat.deepseek.com，並開啟“DeepThink”按鈕。同時，我們也在DeepSeek平臺提供了兼容OpenAI的API：platform.deepseek.com。

若要在本地運行DeepSeek-R1-0528，請訪問DeepSeek-R1倉庫獲取更多信息。與DeepSeek-R1的先前版本相比，DeepSeek-R1-0528的使用建議有以下變化：

現在支持系統提示。
無需在輸出開頭添加“\n”來強制模型進入思考模式。

系統提示

在DeepSeek官方網站/應用中，我們使用帶有特定日期的相同系統提示：

該助手為DeepSeek-R1，由深度求索公司創造。
今天是{current date}。

例如：

該助手為DeepSeek-R1，由深度求索公司創造。
今天是2025年5月28日，星期一。

溫度參數

在我們的網頁和應用環境中，溫度參數 $T_{model}$ 設置為0.6。

文件上傳和網頁搜索提示

對於文件上傳，請按照以下模板創建提示，其中 {file_name}、{file_content} 和 {question} 是參數：

file_template = \
"""[file name]: {file_name}
[file content begin]
{file_content}
[file content end]
{question}"""

對於網頁搜索，{search_results}、{cur_date} 和 {question} 是參數。對於中文查詢，我們使用以下提示：

search_answer_zh_template = \
'''# 以下內容是基於用戶發送的消息的搜索結果:
{search_results}
在我給你的搜索結果中，每個結果都是[webpage X begin]...[webpage X end]格式的，X代表每篇文章的數字索引。請在適當的情況下在句子末尾引用上下文。請按照引用編號[citation:X]的格式在答案中對應部分引用上下文。如果一句話源自多個上下文，請列出所有相關的引用編號，例如[citation:3][citation:5]，切記不要將引用集中在最後返回引用編號，而是在答案對應部分列出。
在回答時，請注意以下幾點：
- 今天是{cur_date}。
- 並非搜索結果的所有內容都與用戶的問題密切相關，你需要結合問題，對搜索結果進行甄別、篩選。
- 對於列舉類的問題（如列舉所有航班信息），儘量將答案控制在10個要點以內，並告訴用戶可以查看搜索來源、獲得完整信息。優先提供信息完整、最相關的列舉項；如非必要，不要主動告訴用戶搜索結果未提供的內容。
- 對於創作類的問題（如寫論文），請務必在正文的段落中引用對應的參考編號，例如[citation:3][citation:5]，不能只在文章末尾引用。你需要解讀並概括用戶的題目要求，選擇合適的格式，充分利用搜索結果並抽取重要信息，生成符合用戶要求、極具思想深度、富有創造力與專業性的答案。你的創作篇幅需要儘可能延長，對於每一個要點的論述要推測用戶的意圖，給出儘可能多角度的回答要點，且務必信息量大、論述詳盡。
- 如果回答很長，請儘量結構化、分段落總結。如果需要分點作答，儘量控制在5個點以內，併合並相關的內容。
- 對於客觀類的問答，如果問題的答案非常簡短，可以適當補充一到兩句相關信息，以豐富內容。
- 你需要根據用戶要求和回答內容選擇合適、美觀的回答格式，確保可讀性強。
- 你的回答應該綜合多個相關網頁來回答，不能重複引用一個網頁。
- 除非用戶要求，否則你回答的語言需要和用戶提問的語言保持一致。
# 用戶消息為：
{question}'''

對於英文查詢，我們使用以下提示：

search_answer_en_template = \
'''# The following contents are the search results related to the user's message:
{search_results}
In the search results I provide to you, each result is formatted as [webpage X begin]...[webpage X end], where X represents the numerical index of each article. Please cite the context at the end of the relevant sentence when appropriate. Use the citation format [citation:X] in the corresponding part of your answer. If a sentence is derived from multiple contexts, list all relevant citation numbers, such as [citation:3][citation:5]. Be sure not to cluster all citations at the end; instead, include them in the corresponding parts of the answer.
When responding, please keep the following points in mind:
- Today is {cur_date}.
- Not all content in the search results is closely related to the user's question. You need to evaluate and filter the search results based on the question.
- For listing-type questions (e.g., listing all flight information), try to limit the answer to 10 key points and inform the user that they can refer to the search sources for complete information. Prioritize providing the most complete and relevant items in the list. Avoid mentioning content not provided in the search results unless necessary.
- For creative tasks (e.g., writing an essay), ensure that references are cited within the body of the text, such as [citation:3][citation:5], rather than only at the end of the text. You need to interpret and summarize the user's requirements, choose an appropriate format, fully utilize the search results, extract key information, and generate an answer that is insightful, creative, and professional. Extend the length of your response as much as possible, addressing each point in detail and from multiple perspectives, ensuring the content is rich and thorough.
- If the response is lengthy, structure it well and summarize it in paragraphs. If a point-by-point format is needed, try to limit it to 5 points and merge related content.
- For objective Q&A, if the answer is very brief, you may add one or two related sentences to enrich the content.
- Choose an appropriate and visually appealing format for your response based on the user's requirements and the content of the answer, ensuring strong readability.
- Your answer should synthesize information from multiple relevant webpages and avoid repeatedly citing the same webpage.
- Unless the user requests otherwise, your response should be in the same language as the user's question.
# The user's message is:
{question}'''

✨ 主要特性

推理能力顯著提升：通過增加計算資源和引入算法優化機制，在最新更新中，DeepSeek R1大幅提高了推理深度和推理能力，在數學、編程和通用邏輯等各種基準評估中表現出色，整體性能接近領先模型，如O3和Gemini 2.5 Pro。
處理複雜推理任務能力增強：與上一版本相比，升級後的模型在處理複雜推理任務方面有顯著改進。例如，在AIME 2025測試中，模型的準確率從之前版本的70%提高到了當前版本的87.5%。這一進步源於推理過程中思維深度的增強：在AIME測試集中，之前的模型每個問題平均使用12K個標記，而新版本每個問題平均使用23K個標記。
幻覺率降低：該版本還降低了幻覺率。
函數調用支持增強：提供了更好的函數調用支持。
編碼體驗優化：帶來了更好的氛圍編碼體驗。

📦 安裝指南

請訪問DeepSeek-R1倉庫獲取在本地運行DeepSeek-R1-0528的更多信息。

📚 詳細文檔

評估結果

DeepSeek-R1-0528

對於我們所有的模型，最大生成長度設置為64K標記。對於需要採樣的基準測試，我們使用溫度為 $0.6$，top-p值為 $0.95$，併為每個查詢生成16個響應以估計pass@1。

類別	基準測試（指標）	DeepSeek R1	DeepSeek R1 0528
通用	MMLU-Redux (EM)	92.9	93.4
通用	MMLU-Pro (EM)	84.0	85.0
通用	GPQA-Diamond (Pass@1)	71.5	81.0
通用	SimpleQA (Correct)	30.1	27.8
通用	FRAMES (Acc.)	82.5	83.0
通用	Humanity's Last Exam (Pass@1)	8.5	17.7
代碼	LiveCodeBench (2408 - 2505) (Pass@1)	63.5	73.3
代碼	Codeforces-Div1 (Rating)	1530	1930
代碼	SWE Verified (Resolved)	49.2	57.6
代碼	Aider-Polyglot (Acc.)	53.3	71.6
數學	AIME 2024 (Pass@1)	79.8	91.4
數學	AIME 2025 (Pass@1)	70.0	87.5
數學	HMMT 2025 (Pass@1)	41.7	79.4
數學	CNMO 2024 (Pass@1)	78.8	86.9
工具	BFCL_v3_MultiTurn (Acc)	-	37.0
工具	Tau-Bench (Pass@1)	-	53.5(Airline)/63.9(Retail)

注：我們使用無代理框架評估SWE-Verified上的模型性能。我們僅評估HLE測試集中的純文本提示。在Tau-bench評估中，使用GPT-4.1扮演用戶角色。

DeepSeek-R1-0528-Qwen3-8B

同時，我們從DeepSeek-R1-0528中提煉思維鏈對Qwen3 8B Base進行後訓練，得到了DeepSeek-R1-0528-Qwen3-8B。該模型在AIME 2024上的表現達到了開源模型中的最優水平，比Qwen3 8B高出10.0%，與Qwen3-235B-thinking的性能相當。

模型	AIME 24	AIME 25	HMMT Feb 25	GPQA Diamond	LiveCodeBench (2408 - 2505)
Qwen3-235B-A22B	85.7	81.5	62.5	71.1	66.5
Qwen3-32B	81.4	72.9	-	68.4	-
Qwen3-8B	76.0	67.3	-	62.0	-
Phi-4-Reasoning-Plus-14B	81.3	78.0	53.6	69.3	-
Gemini-2.5-Flash-Thinking-0520	82.3	72.0	64.2	82.8	62.3
o3-mini (medium)	79.6	76.7	53.3	76.8	65.9
DeepSeek-R1-0528-Qwen3-8B	86.0	76.3	61.5	61.1	60.5

🔧 技術細節

DeepSeek R1模型進行了小版本升級，當前版本為DeepSeek-R1-0528。在最新更新中，DeepSeek R1通過增加計算資源和在後期訓練中引入算法優化機制，顯著提高了推理深度和推理能力。

📄 許可證

此代碼倉庫遵循MIT許可證。DeepSeek-R1模型的使用也遵循MIT許可證。DeepSeek-R1系列（包括基礎版和對話版）支持商業使用和蒸餾。

📚 引用

@misc{deepseekai2025deepseekr1incentivizingreasoningcapability,
      title={DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning}, 
      author={DeepSeek-AI},
      year={2025},
      eprint={2501.12948},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2501.12948}, 
}