🚀 DeepSeek-R1-Distill-Llama-3B
DeepSeek-R1-Distill-Llama-3B is a distilled version of DeepSeek-R1, built on the Llama-3.2-3B model and fine-tuned with the R1-Distill-SFT dataset. It is a text-generation model and achieves reasonable performance across several text-generation benchmarks.

✨ Key Features
- Distilled on top of the Llama-3.2-3B model, leveraging the R1-Distill-SFT dataset.
- Supports the Llama3 prompt template, making the model straightforward to use.
- Achieves reasonable scores on several text-generation benchmark evaluations.
📦 Installation
The original documentation does not provide dedicated installation steps.
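As a practical note, the usage example below only imports `torch` and `transformers`, so installing those two packages (for example, `pip install torch transformers`) should be enough to run it; this is inferred from the example rather than an official requirement list.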
💻 Usage Examples
Basic Usage
You can prompt the model using the Llama3 chat template:
```
<|start_header_id|>system<|end_header_id|>
{system}<|eot_id|>
<|start_header_id|>user<|end_header_id|>
{user}<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>
{assistant}<|eot_id|>
```
Code Example
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained(
    "suayptalha/DeepSeek-R1-Distill-Llama-3B",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("suayptalha/DeepSeek-R1-Distill-Llama-3B")

# System prompt asking the model to wrap its reasoning in <think> tags
SYSTEM_PROMPT = """Respond in the following format:
<think>
You should reason between these tags.
</think>
Answer goes here...
Always use <think> </think> tags even if they are not necessary.
"""

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "Which one is larger? 9.11 or 9.9?"},
]

# Build the Llama3 chat prompt and move the input ids to the GPU
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

# Generate and decode the response (special tokens kept so the <think> block is visible)
output = model.generate(
    input_ids=inputs,
    max_new_tokens=256,
    use_cache=True,
    do_sample=True,      # enable sampling so the temperature setting takes effect
    temperature=0.7,
)
decoded_output = tokenizer.decode(output[0], skip_special_tokens=False)
print(decoded_output)
```
Example Output
```
<think>
First, I need to compare the two numbers 9.11 and 9.9.
Next, I'll analyze each number. The first digit after the decimal point in 9.11 is 1, and in 9.9, it's 9.
Since 9 is greater than 1, 9.9 is larger than 9.11.
</think>
To determine which number is larger, let's compare the two numbers:

**9.11** and **9.9**

1. **Identify the Decimal Places:**
   - Both numbers have two decimal places.
2. **Compare the Tens Place (Right of the Decimal Point):**
   - **9.11:** The tens place is 1.
   - **9.9:** The tens place is 9.
3. **Conclusion:**
   - Since 9 is greater than 1, the number with the larger tens place is 9.9.

**Answer:** **9.9** is larger than **9.11**.
```
Suggested System Prompt
```
Respond in the following format:
<think>
You should reason between these tags.
</think>
Answer goes here...
Always use <think> </think> tags even if they are not necessary.
```
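Since the model is instructed to always wrap its reasoning in `<think>` tags, you may want to separate the reasoning from the final answer after generation. Below is a minimal post-processing sketch; the `split_think` helper and its regular expression are illustrative assumptions, not part of the model's official tooling.
```python
import re

def split_think(text: str) -> tuple[str, str]:
    """Split a model response into (reasoning, answer) based on <think> tags.

    Hypothetical helper: it assumes the response contains at most one
    well-formed <think>...</think> block, as requested by the system prompt above.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        # No reasoning block found; treat the whole text as the answer.
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

# Example usage with a decoded generation (see the code example above):
# reasoning, answer = split_think(decoded_output)
# print(answer)
```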
📚 Documentation
Training Parameters

| Parameter  | Value            |
|------------|------------------|
| lr         | 2e-5             |
| epochs     | 1                |
| batch_size | 16               |
| optimizer  | paged_adamw_8bit |
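For readers who use the transformers Trainer rather than axolotl, the sketch below shows roughly how these hyperparameters could map onto `TrainingArguments`. This is an assumed, approximate mapping for illustration only; the actual fine-tune was driven by the axolotl configuration listed under Technical Details.
```python
from transformers import TrainingArguments

# Approximate mapping of the table above onto transformers TrainingArguments.
# The effective batch size of 16 comes from micro_batch_size (2) x gradient_accumulation_steps (8),
# matching the axolotl config below; this is an illustrative sketch, not the original training script.
training_args = TrainingArguments(
    output_dir="./finetune-sft-results",
    learning_rate=2e-5,
    num_train_epochs=1,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,   # 2 * 8 = 16 effective batch size
    optim="paged_adamw_8bit",        # requires the bitsandbytes package
    lr_scheduler_type="cosine",
    warmup_steps=100,
    logging_steps=50,
    fp16=True,
    gradient_checkpointing=True,
)
```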
Detailed evaluation results can be viewed here.
| Metric              | Value |
|---------------------|-------|
| Average             | 23.27 |
| IFEval (0-Shot)     | 70.93 |
| BBH (3-Shot)        | 21.45 |
| MATH Lvl 5 (4-Shot) | 20.92 |
| GPQA (0-shot)       | 1.45  |
| MuSR (0-shot)       | 2.91  |
| MMLU-PRO (5-shot)   | 21.98 |
🔧 Technical Details
Axolotl configuration:
```yaml
base_model: unsloth/Llama-3.2-3B-Instruct
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

load_in_8bit: true
load_in_4bit: false
strict: false

chat_template: llama3
datasets:
  - path: ./custom_dataset.json
    type: chat_template
    conversation: chatml
    ds_type: json

add_bos_token: true
add_eos_token: true
use_default_system_prompt: false

special_tokens:
  bos_token: "<|begin_of_text|>"
  eos_token: "<|eot_id|>"
  pad_token: "<|eot_id|>"
  additional_special_tokens:
    - "<|begin_of_text|>"
    - "<|eot_id|>"

adapter: lora
lora_model_dir:
lora_r: 16
lora_alpha: 32
lora_dropout: 0.1
lora_target_linear: true

hub_model_id: suayptalha/DeepSeek-R1-Distill-Llama-3B

sequence_len: 2048
sample_packing: false
pad_to_sequence_len: true

micro_batch_size: 2
gradient_accumulation_steps: 8
num_epochs: 1
learning_rate: 2e-5
optimizer: paged_adamw_8bit
lr_scheduler: cosine

train_on_inputs: false
group_by_length: false
bf16: false
fp16: true
tf32: false

gradient_checkpointing: true
flash_attention: false

logging_steps: 50
warmup_steps: 100
saves_per_epoch: 1

output_dir: ./finetune-sft-results
save_safetensors: true
```
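The configuration above fine-tunes the model with 8-bit loading (`load_in_8bit: true`) and a LoRA adapter. If you want to run inference under similar memory constraints, the sketch below loads the published model in 8-bit via bitsandbytes; this is an assumed inference setup, not something prescribed by the model card.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Illustrative 8-bit inference setup (assumes the bitsandbytes package is installed);
# it mirrors the load_in_8bit setting used during fine-tuning.
model = AutoModelForCausalLM.from_pretrained(
    "suayptalha/DeepSeek-R1-Distill-Llama-3B",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("suayptalha/DeepSeek-R1-Distill-Llama-3B")
```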
📄 License
This project is released under the MIT License.
Support
If you find this project helpful, you can support the developer in the following ways:
