CodeV-R1-Distill-Qwen-7B開源模型 - 高效生成Verilog RTL代碼，基準測試表現佳

首頁

Codev R1 Distill Qwen 7B

由zhuyaoyu開發

基於DeepSeek-R1蒸餾的Verilog RTL代碼生成模型，在Verilog基準測試中表現優異

大型語言模型

Transformers

#Verilog代碼生成 #硬件設計推理 #知識蒸餾優化

下載量 154

發布時間 : 3/22/2025

模型概述

該模型是從DeepSeek-R1蒸餾而來的Verilog專用模型，專注於硬件描述語言(HDL)的代碼生成和問題解決，在VerilogEval和RTLLM基準測試中超越同類模型，同時提升了數學推理能力

模型特點

卓越的Verilog生成能力

在VerilogEval和RTLLM基準測試中超越GPT-4等通用大模型

知識蒸餾技術

從DeepSeek-R1蒸餾獲得類似推理能力

跨領域能力提升

Verilog訓練意外提升了數學推理能力

高質量數據篩選

通過嚴格過濾保留87,000個高質量(問題，代碼)對

模型能力

Verilog代碼生成

硬件設計問題解決

數學推理

代碼補全

規範到RTL翻譯

使用案例

芯片設計

RTL代碼生成

根據功能規範自動生成寄存器傳輸級代碼

在VerilogEval規範到RTL任務中達到65.4%準確率

代碼補全

輔助硬件工程師完成部分Verilog代碼

在VerilogEval補全任務中達到65.1%準確率

硬件驗證

測試用例生成

為硬件驗證生成測試場景

🚀 CodeV-R1-Distill-Qwen-7B

CodeV-R1-Distill-Qwen-7B 是從 DeepSeek-R1 中使用 CodeV 數據集提煉出來的模型。該模型在主要的 Verilog 基準測試中優於先前的非推理大語言模型，展示了卓越的代碼合成和問題解決能力。此外，提煉 Verilog 代碼還增強了模型的數學推理能力，表明以硬件為中心的訓練與一般邏輯推理之間存在更廣泛的協同作用。

🚀 快速開始

CodeV-R1-Distill-Qwen-7B 可以像 Qwen 或 Llama 模型一樣使用。例如，你可以使用 vLLM 輕鬆啟動一個服務：

vllm serve zhuyaoyu/CodeV-R1-Distill-Qwen-7B --tensor-parallel-size 2 --max-model-len 16384 --enforce-eager

💡 使用建議

在訓練和評估期間，我們使用了一個系統提示：

You are a helpful assistant. The assistant first thinks about the reasoning process in the mind and then provides the user with the answer. The reasoning process and answer are enclosed within <think> </think> and<answer> </answer> tags, respectively, i.e., <think> reasoning process here </think><answer> answer here </answer>.  Now the user asks you to write verilog code. After thinking, when you finally reach a conclusion, enclose the final verilog code in ```verilog ``` within <answer> </answer> tags. i.e., <answer> ```verilog\n module top_module(in, out, ...) ... ``` </answer>.\n

建議使用此提示。

✨ 主要特性

大語言模型（LLM）的後訓練階段發展迅速，如 OpenAI 的 GPT-o1、DeepSeek-R1 和 Kimi-1.5 等模型展現出了卓越的推理能力。然而，像 Verilog 這樣的硬件描述語言（HDL）面臨著類似低資源語言的挑戰，包括高質量指令跟隨數據有限以及模型在生成準確的寄存器傳輸級（RTL）代碼方面的能力受限。為了解決這些問題，我們提出利用知識蒸餾為較小、高效的模型賦予類似 DeepSeek-R1 的推理能力。

作為 CodeV 工作的延續，我們引入了 CodeV-R1-Distill-Qwen-7B。該模型具有以下特性：

性能優越：在主要的 Verilog 基準測試中，該模型優於先前的非推理大語言模型，展示了卓越的代碼合成和問題解決能力。
增強推理能力：提煉 Verilog 代碼還增強了模型的數學推理能力，表明以硬件為中心的訓練與一般邏輯推理之間存在更廣泛的協同作用。

📦 安裝指南

文檔未提及安裝相關內容，故跳過該章節。

💻 使用示例

基礎用法

使用 vLLM 啟動服務的示例：

vllm serve zhuyaoyu/CodeV-R1-Distill-Qwen-7B --tensor-parallel-size 2 --max-model-len 16384 --enforce-eager

📚 詳細文檔

模型概述

屬性	詳情
基礎模型	Qwen/Qwen2.5-Coder-7B-Instruct
庫名稱	transformers
標籤	verilog

數據準備

最初，我們使用 Deepseek-v3 對原始 CodeV 數據集中的問題進行重新總結和表述。然後，我們過濾掉那些 Qwen2.5-Coder-7B-Instruct 和 Qwen2.5-Coder-32B-Instruct 在五次嘗試內能夠解決的簡單問題，以及存在不可綜合問題的問題。對於剩餘的數據，我們使用 DeepSeek-R1 為每個問題生成一個響應。與基準測試問題相比，Rouge-L 分數大於 0.5 的問題也會被過濾掉。經過這些處理後，大約剩下 87,000 個（問題，代碼）對。

訓練過程

我們使用 LLaMAFactory 對 Qwen2.5-Coder-7B-Instruct 進行監督微調（SFT），使用這 87,000 對精煉後的數據集。訓練進行了六個 epoch，學習率為 1e-5，批量大小為 64。

評估結果

在評估階段，最大生成長度配置為 16,384 個 token。應用了 0.6 的溫度設置，每個查詢生成 20 個響應以估計 pass@1 分數。

我們的評估涵蓋了 Verilog 基準測試，包括 VerilogEval 和 RTLLM。對於 VerilogEval v2，我們研究了規範到 RTL 翻譯和代碼完成任務中的零樣本場景。對於 RTLLM，報告的是版本 1.1 的結果，該版本提供了更廣泛的比較分析。此外，我們發現通過 DeepSeek-R1 獲得的 Verilog 問題推理過程增強了模型的域外數學能力。

VerilogEval (v2)

模型	模型大小	類型	規範到 RTL	代碼完成
GPT-4o	未披露	通用	62.5%	59.0%
GPT-4 Turbo	未披露	通用	61.1%	53.9%
GPT-4	未披露	通用	32.0%	42.3%
Mistral Large	未披露	通用	37.5%	34.0%
Llama3.1	405B	通用	57.2%	56.4%
Llama3.1	70B	通用	42.8%	35.3%
Llama3	70B	通用	43.9%	37.8%
Llama2	70B	通用	5.3%	1.3%
Llama3.1	8B	通用	19.1%	2.6%
CodeLlama	70B	編碼	34.9%	37.2%
DeepSeek Coder	33B	編碼	21.7%	25.0%
CodeGemma	7B	編碼	9.5%	8.3%
DeepSeek Coder	6.7B	編碼	29.6%	24.4%
RTL-Coder	6.7B	Verilog RTL	36.8%	35.9%
CodeV-R1-distill (我們的模型)	7B	Verilog RTL	65.4%	65.1%

RTLLM (v1.1)

模型	模型大小	類型	Pass@1
GPT-4o	未披露	通用	33.8%
GPT-3.5 Turbo	未披露	通用	28.3%
Llama3.1	405B	通用	38.9%
Nemotron-4	340B	通用	18.9%
Llama3.1	8B	通用	19.1%
CodeLlama	7B	編碼	17.9%
CodeQwen	7B	編碼	24.1%
Starcoder2	15B	編碼	15.5%
DeepSeek Coder	6.7B	編碼	23.1%
DeepSeek-Coder-V2	16B	編碼	33.1%
DeepSeek-Coder-V2	236B	編碼	34.5%
RTL-Coder	6.7B	Verilog RTL	36.8%
CraftRTL	6.7B	Verilog RTL	53.1%
CodeV-R1-distill (我們的模型)	7B	Verilog RTL	56.2%

數學評估

模型	AIME	Math	AMC	Minerva	奧林匹克基準	平均
Qwen2.5-7b-instruct-1M	11.25%	72.61%	41.11%	25.92%	34.66%	37.11%
Qwen2.5-math-7b-instruct	12.08%	82.25%	49.4%	27.64%	37.31%	41.74%
Qwen2.5-coder-7b-instruct (基線)	5.63%	63.5%	35.62%	21.02%	28.64%	30.88%
CodeV-R1-distill (我們的模型)	11.04%	74.35%	45.86%	25.79%	38.7%	39.15%

🔧 技術細節

文檔未提及技術實現細節相關內容，故跳過該章節。

📄 許可證

CodeV-R1-Distill-Qwen-7B 源自 Qwen-2.5 系列，該系列最初根據 Apache 2.0 許可證許可，現在使用由 DeepSeek-R1 整理的 87k 樣本進行微調。

引用

@misc{CodeV-R1-Distill-Qwen-7B,
  author = {IPRC-DIP},
  title = {CodeV Model Distilled from DeepSeek-R1},
  url = {https://huggingface.co/zhuyaoyu/CodeV-R1-Distill-Qwen-7B},
  year = {2025}
}