🚀 DeepSeek-R1-Distill-Llama-3B
DeepSeek-R1-Distill-Llama-3B is a distilled version of DeepSeek-R1, built on the Llama-3.2-3B model and trained with the R1-Distill-SFT dataset. It is a text-generation model and shows reasonable performance across several text-generation benchmarks.

✨ Key Features
- Distilled from DeepSeek-R1 onto the Llama-3.2-3B base model using the R1-Distill-SFT dataset.
- Supports the Llama3 prompt template, making it straightforward to use.
- Achieves measurable scores on several text-generation benchmarks (see the results table below).
📦 Installation
The original documentation does not provide specific installation steps.
💻 Usage Examples
Basic Usage
You can prompt the model using the Llama3 chat template:
<|start_header_id|>system<|end_header_id|>
{system}<|eot_id|>
<|start_header_id|>user<|end_header_id|>
{user}<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>
{assistant}<|eot_id|>
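As a quick check that the tokenizer applies this template for you, the sketch below renders a message list as plain text via apply_chat_template. The message contents here are placeholders, not from the original card:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("suayptalha/DeepSeek-R1-Distill-Llama-3B")

# Placeholder conversation used only to show the rendered template
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]

# tokenize=False returns the Llama3-formatted prompt string instead of token ids
prompt_text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
print(prompt_text)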
Code Example
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the distilled model and its tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained(
    "suayptalha/DeepSeek-R1-Distill-Llama-3B",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("suayptalha/DeepSeek-R1-Distill-Llama-3B")

# System prompt that asks the model to wrap its reasoning in <think> tags
SYSTEM_PROMPT = """Respond in the following format:
<think>
You should reason between these tags.
</think>
Answer goes here...
Always use <think> </think> tags even if they are not necessary.
"""

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "Which one is larger? 9.11 or 9.9?"},
]

# Render the conversation with the Llama3 chat template and tokenize it
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

# do_sample=True makes the temperature setting take effect
output = model.generate(
    input_ids=inputs,
    max_new_tokens=256,
    use_cache=True,
    do_sample=True,
    temperature=0.7,
)
decoded_output = tokenizer.decode(output[0], skip_special_tokens=False)
print(decoded_output)
Example Output
<think>
First, I need to compare the two numbers 9.11 and 9.9.
Next, I'll analyze each number. The first digit after the decimal point in 9.11 is 1, and in 9.9, it's 9.
Since 9 is greater than 1, 9.9 is larger than 9.11.
</think>
To determine which number is larger, let's compare the two numbers:
**9.11** and **9.9**
1. **Identify the Decimal Places:**
- Both numbers have two decimal places.
2. **Compare the Tens Place (Right of the Decimal Point):**
- **9.11:** The tens place is 1.
- **9.9:** The tens place is 9.
3. **Conclusion:**
- Since 9 is greater than 1, the number with the larger tens place is 9.9.
**Answer:** **9.9** is larger than **9.11**.
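Because the recommended format wraps the reasoning in <think> tags, it can be convenient to split the reasoning from the final answer in code. A minimal sketch, where the helper name and regular expression are illustrative assumptions and not part of the original card:

import re

def split_think(text: str):
    """Separate <think>...</think> reasoning from the final answer."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()  # everything after </think>
    return reasoning, answer

reasoning, answer = split_think(decoded_output)
print("Reasoning:", reasoning)
print("Answer:", answer)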
Suggested System Prompt
Respond in the following format:
<think>
You should reason between these tags.
</think>
Answer goes here...
Always use <think> </think> tags even if they are not necessary.
📚 Documentation
Training Parameters
| Parameter | Details |
|-----------|---------|
| lr | 2e-5 |
| epochs | 1 |
| batch_size | 16 |
| optimizer | paged_adamw_8bit |
Detailed results can be viewed here.
| Metric | Value |
|--------|-------|
| Average | 23.27 |
| IFEval (0-Shot) | 70.93 |
| BBH (3-Shot) | 21.45 |
| MATH Lvl 5 (4-Shot) | 20.92 |
| GPQA (0-shot) | 1.45 |
| MuSR (0-shot) | 2.91 |
| MMLU-PRO (5-shot) | 21.98 |
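The Average row is simply the arithmetic mean of the six benchmark scores above, as a quick check confirms:

# Mean of the six benchmark scores listed in the table
scores = [70.93, 21.45, 20.92, 1.45, 2.91, 21.98]
print(round(sum(scores) / len(scores), 2))  # 23.27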
🔧 Technical Details
View the axolotl configuration
base_model: unsloth/Llama-3.2-3B-Instruct
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer
load_in_8bit: true
load_in_4bit: false
strict: false
chat_template: llama3
datasets:
  - path: ./custom_dataset.json
    type: chat_template
    conversation: chatml
    ds_type: json
add_bos_token: true
add_eos_token: true
use_default_system_prompt: false
special_tokens:
  bos_token: "<|begin_of_text|>"
  eos_token: "<|eot_id|>"
  pad_token: "<|eot_id|>"
  additional_special_tokens:
    - "<|begin_of_text|>"
    - "<|eot_id|>"
adapter: lora
lora_model_dir:
lora_r: 16
lora_alpha: 32
lora_dropout: 0.1
lora_target_linear: true
hub_model_id: suayptalha/DeepSeek-R1-Distill-Llama-3B
sequence_len: 2048
sample_packing: false
pad_to_sequence_len: true
micro_batch_size: 2
gradient_accumulation_steps: 8
num_epochs: 1
learning_rate: 2e-5
optimizer: paged_adamw_8bit
lr_scheduler: cosine
train_on_inputs: false
group_by_length: false
bf16: false
fp16: true
tf32: false
gradient_checkpointing: true
flash_attention: false
logging_steps: 50
warmup_steps: 100
saves_per_epoch: 1
output_dir: ./finetune-sft-results
save_safetensors: true
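For readers who do not use axolotl, the LoRA settings above translate roughly into the following peft sketch. This is an illustrative approximation, not the actual training script; in particular, target_modules="all-linear" is assumed here as the peft counterpart of lora_target_linear: true.

from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# load_in_8bit: true in the axolotl config above
base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/Llama-3.2-3B-Instruct",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)

# lora_r / lora_alpha / lora_dropout from the axolotl config above
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules="all-linear",  # assumed equivalent of lora_target_linear: true
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()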
📄 License
This project is released under the MIT License.
Support
If you find this project helpful, you can support the developer in the following ways:
