🚀 DeepSeek-R1-Distill-Llama-3B
DeepSeek-R1-Distill-Llama-3B is a distilled version of DeepSeek-R1, built on the Llama-3.2-3B model and trained with the R1-Distill-SFT dataset. It is a text-generation model and shows reasonable performance across several text-generation benchmarks.

✨ Key Features
- Distilled from DeepSeek-R1 onto the Llama-3.2-3B base model using the R1-Distill-SFT dataset.
- Supports the Llama3 prompt template, making it straightforward to use.
- Achieves measurable scores on several text-generation benchmarks (see the results table below).
📦 Installation
The original documentation does not provide specific installation steps.
💻 Usage Examples
Basic Usage
You can prompt the model using the Llama3 chat template:
<|start_header_id|>system<|end_header_id|>
{system}<|eot_id|>
<|start_header_id|>user<|end_header_id|>
{user}<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>
{assistant}<|eot_id|>
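As a quick check that the tokenizer applies this template for you, the sketch below renders a message list as plain text via apply_chat_template. The message contents here are placeholders, not from the original card:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("suayptalha/DeepSeek-R1-Distill-Llama-3B")

# Placeholder conversation used only to show the rendered template
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]

# tokenize=False returns the Llama3-formatted prompt string instead of token ids
prompt_text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
print(prompt_text)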
Code Example
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the distilled model and its tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained(
    "suayptalha/DeepSeek-R1-Distill-Llama-3B",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("suayptalha/DeepSeek-R1-Distill-Llama-3B")

# System prompt that asks the model to wrap its reasoning in <think> tags
SYSTEM_PROMPT = """Respond in the following format:
<think>
You should reason between these tags.
</think>
Answer goes here...
Always use <think> </think> tags even if they are not necessary.
"""

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "Which one is larger? 9.11 or 9.9?"},
]

# Render the conversation with the Llama3 chat template and tokenize it
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

# do_sample=True makes the temperature setting take effect
output = model.generate(
    input_ids=inputs,
    max_new_tokens=256,
    use_cache=True,
    do_sample=True,
    temperature=0.7,
)
decoded_output = tokenizer.decode(output[0], skip_special_tokens=False)
print(decoded_output)
Example Output
<think>
First, I need to compare the two numbers 9.11 and 9.9.
Next, I'll analyze each number. The first digit after the decimal point in 9.11 is 1, and in 9.9, it's 9.
Since 9 is greater than 1, 9.9 is larger than 9.11.
</think>
To determine which number is larger, let's compare the two numbers:
**9.11** and **9.9**
1. **Identify the Decimal Places:**
- Both numbers have two decimal places.
2. **Compare the Tens Place (Right of the Decimal Point):**
- **9.11:** The tens place is 1.
- **9.9:** The tens place is 9.
3. **Conclusion:**
- Since 9 is greater than 1, the number with the larger tens place is 9.9.
**Answer:** **9.9** is larger than **9.11**.
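Because the recommended format wraps the reasoning in <think> tags, it can be convenient to split the reasoning from the final answer in code. A minimal sketch, where the helper name and regular expression are illustrative assumptions and not part of the original card:

import re

def split_think(text: str):
    """Separate <think>...</think> reasoning from the final answer."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()  # everything after </think>
    return reasoning, answer

reasoning, answer = split_think(decoded_output)
print("Reasoning:", reasoning)
print("Answer:", answer)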
Suggested System Prompt
Respond in the following format:
<think>
You should reason between these tags.
</think>
Answer goes here...
Always use <think> </think> tags even if they are not necessary.
📚 Documentation
Training Parameters
| Parameter | Details |
|-----------|---------|
| lr | 2e-5 |
| epochs | 1 |
| batch_size | 16 |
| optimizer | paged_adamw_8bit |
Detailed results can be viewed here.
| Metric | Value |
|--------|-------|
| Average | 23.27 |
| IFEval (0-Shot) | 70.93 |
| BBH (3-Shot) | 21.45 |
| MATH Lvl 5 (4-Shot) | 20.92 |
| GPQA (0-shot) | 1.45 |
| MuSR (0-shot) | 2.91 |
| MMLU-PRO (5-shot) | 21.98 |
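The Average row is simply the arithmetic mean of the six benchmark scores above, as a quick check confirms:

# Mean of the six benchmark scores listed in the table
scores = [70.93, 21.45, 20.92, 1.45, 2.91, 21.98]
print(round(sum(scores) / len(scores), 2))  # 23.27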
🔧 Technical Details
View the axolotl configuration
base_model: unsloth/Llama-3.2-3B-Instruct
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer
load_in_8bit: true
load_in_4bit: false
strict: false
chat_template: llama3
datasets:
  - path: ./custom_dataset.json
    type: chat_template
    conversation: chatml
    ds_type: json
add_bos_token: true
add_eos_token: true
use_default_system_prompt: false
special_tokens:
  bos_token: "<|begin_of_text|>"
  eos_token: "<|eot_id|>"
  pad_token: "<|eot_id|>"
  additional_special_tokens:
    - "<|begin_of_text|>"
    - "<|eot_id|>"
adapter: lora
lora_model_dir:
lora_r: 16
lora_alpha: 32
lora_dropout: 0.1
lora_target_linear: true
hub_model_id: suayptalha/DeepSeek-R1-Distill-Llama-3B
sequence_len: 2048
sample_packing: false
pad_to_sequence_len: true
micro_batch_size: 2
gradient_accumulation_steps: 8
num_epochs: 1
learning_rate: 2e-5
optimizer: paged_adamw_8bit
lr_scheduler: cosine
train_on_inputs: false
group_by_length: false
bf16: false
fp16: true
tf32: false
gradient_checkpointing: true
flash_attention: false
logging_steps: 50
warmup_steps: 100
saves_per_epoch: 1
output_dir: ./finetune-sft-results
save_safetensors: true
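For readers who do not use axolotl, the LoRA settings above translate roughly into the following peft sketch. This is an illustrative approximation, not the actual training script; in particular, target_modules="all-linear" is assumed here as the peft counterpart of lora_target_linear: true.

from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# load_in_8bit: true in the axolotl config above
base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/Llama-3.2-3B-Instruct",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)

# lora_r / lora_alpha / lora_dropout from the axolotl config above
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules="all-linear",  # assumed equivalent of lora_target_linear: true
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()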
📄 License
This project is released under the MIT License.
Support
If you find this project helpful, you can support the developer in the following ways:
