🚀 DeepSeek-R1-Distill-Llama-3B
DeepSeek-R1-Distill-Llama-3B is a distilled version of DeepSeek-R1, built on the Llama-3.2-3B model and fine-tuned with the R1-Distill-SFT dataset. It is a text-generation model and achieves reasonable performance across several text-generation benchmarks.

✨ Key Features
- Distilled on top of the Llama-3.2-3B model, leveraging the R1-Distill-SFT dataset.
- Supports the Llama3 prompt template, making the model straightforward to use.
- Achieves reasonable scores on several text-generation benchmark evaluations.
📦 Installation
The original documentation does not provide dedicated installation steps.
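As a practical note, the usage example below only imports `torch` and `transformers`, so installing those two packages (for example, `pip install torch transformers`) should be enough to run it; this is inferred from the example rather than an official requirement list.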
💻 Usage Examples
Basic Usage
You can prompt the model using the Llama3 chat template:
```
<|start_header_id|>system<|end_header_id|>
{system}<|eot_id|>
<|start_header_id|>user<|end_header_id|>
{user}<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>
{assistant}<|eot_id|>
```
Code Example
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained(
    "suayptalha/DeepSeek-R1-Distill-Llama-3B",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("suayptalha/DeepSeek-R1-Distill-Llama-3B")

# System prompt asking the model to wrap its reasoning in <think> tags
SYSTEM_PROMPT = """Respond in the following format:
<think>
You should reason between these tags.
</think>
Answer goes here...
Always use <think> </think> tags even if they are not necessary.
"""

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "Which one is larger? 9.11 or 9.9?"},
]

# Build the Llama3 chat prompt and move the input ids to the GPU
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

# Generate and decode the response (special tokens kept so the <think> block is visible)
output = model.generate(
    input_ids=inputs,
    max_new_tokens=256,
    use_cache=True,
    do_sample=True,      # enable sampling so the temperature setting takes effect
    temperature=0.7,
)
decoded_output = tokenizer.decode(output[0], skip_special_tokens=False)
print(decoded_output)
```
Example Output
```
<think>
First, I need to compare the two numbers 9.11 and 9.9.
Next, I'll analyze each number. The first digit after the decimal point in 9.11 is 1, and in 9.9, it's 9.
Since 9 is greater than 1, 9.9 is larger than 9.11.
</think>
To determine which number is larger, let's compare the two numbers:

**9.11** and **9.9**

1. **Identify the Decimal Places:**
   - Both numbers have two decimal places.
2. **Compare the Tens Place (Right of the Decimal Point):**
   - **9.11:** The tens place is 1.
   - **9.9:** The tens place is 9.
3. **Conclusion:**
   - Since 9 is greater than 1, the number with the larger tens place is 9.9.

**Answer:** **9.9** is larger than **9.11**.
```
Suggested System Prompt
```
Respond in the following format:
<think>
You should reason between these tags.
</think>
Answer goes here...
Always use <think> </think> tags even if they are not necessary.
```
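Since the model is instructed to always wrap its reasoning in `<think>` tags, you may want to separate the reasoning from the final answer after generation. Below is a minimal post-processing sketch; the `split_think` helper and its regular expression are illustrative assumptions, not part of the model's official tooling.
```python
import re

def split_think(text: str) -> tuple[str, str]:
    """Split a model response into (reasoning, answer) based on <think> tags.

    Hypothetical helper: it assumes the response contains at most one
    well-formed <think>...</think> block, as requested by the system prompt above.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        # No reasoning block found; treat the whole text as the answer.
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

# Example usage with a decoded generation (see the code example above):
# reasoning, answer = split_think(decoded_output)
# print(answer)
```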
📚 Documentation
Training Parameters

| Parameter  | Value            |
|------------|------------------|
| lr         | 2e-5             |
| epochs     | 1                |
| batch_size | 16               |
| optimizer  | paged_adamw_8bit |
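For readers who use the transformers Trainer rather than axolotl, the sketch below shows roughly how these hyperparameters could map onto `TrainingArguments`. This is an assumed, approximate mapping for illustration only; the actual fine-tune was driven by the axolotl configuration listed under Technical Details.
```python
from transformers import TrainingArguments

# Approximate mapping of the table above onto transformers TrainingArguments.
# The effective batch size of 16 comes from micro_batch_size (2) x gradient_accumulation_steps (8),
# matching the axolotl config below; this is an illustrative sketch, not the original training script.
training_args = TrainingArguments(
    output_dir="./finetune-sft-results",
    learning_rate=2e-5,
    num_train_epochs=1,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,   # 2 * 8 = 16 effective batch size
    optim="paged_adamw_8bit",        # requires the bitsandbytes package
    lr_scheduler_type="cosine",
    warmup_steps=100,
    logging_steps=50,
    fp16=True,
    gradient_checkpointing=True,
)
```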
Detailed evaluation results can be viewed here.
| Metric              | Value |
|---------------------|-------|
| Average             | 23.27 |
| IFEval (0-Shot)     | 70.93 |
| BBH (3-Shot)        | 21.45 |
| MATH Lvl 5 (4-Shot) | 20.92 |
| GPQA (0-shot)       | 1.45  |
| MuSR (0-shot)       | 2.91  |
| MMLU-PRO (5-shot)   | 21.98 |
🔧 Technical Details
Axolotl configuration:
```yaml
base_model: unsloth/Llama-3.2-3B-Instruct
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

load_in_8bit: true
load_in_4bit: false
strict: false

chat_template: llama3
datasets:
  - path: ./custom_dataset.json
    type: chat_template
    conversation: chatml
    ds_type: json

add_bos_token: true
add_eos_token: true
use_default_system_prompt: false

special_tokens:
  bos_token: "<|begin_of_text|>"
  eos_token: "<|eot_id|>"
  pad_token: "<|eot_id|>"
  additional_special_tokens:
    - "<|begin_of_text|>"
    - "<|eot_id|>"

adapter: lora
lora_model_dir:
lora_r: 16
lora_alpha: 32
lora_dropout: 0.1
lora_target_linear: true

hub_model_id: suayptalha/DeepSeek-R1-Distill-Llama-3B

sequence_len: 2048
sample_packing: false
pad_to_sequence_len: true

micro_batch_size: 2
gradient_accumulation_steps: 8
num_epochs: 1
learning_rate: 2e-5
optimizer: paged_adamw_8bit
lr_scheduler: cosine

train_on_inputs: false
group_by_length: false
bf16: false
fp16: true
tf32: false

gradient_checkpointing: true
flash_attention: false

logging_steps: 50
warmup_steps: 100
saves_per_epoch: 1

output_dir: ./finetune-sft-results
save_safetensors: true
```
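The configuration above fine-tunes the model with 8-bit loading (`load_in_8bit: true`) and a LoRA adapter. If you want to run inference under similar memory constraints, the sketch below loads the published model in 8-bit via bitsandbytes; this is an assumed inference setup, not something prescribed by the model card.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Illustrative 8-bit inference setup (assumes the bitsandbytes package is installed);
# it mirrors the load_in_8bit setting used during fine-tuning.
model = AutoModelForCausalLM.from_pretrained(
    "suayptalha/DeepSeek-R1-Distill-Llama-3B",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("suayptalha/DeepSeek-R1-Distill-Llama-3B")
```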
📄 License
This project is released under the MIT License.
Support
If you find this project helpful, you can support the developer in the following ways:
