TinyMistral-248M-Chat-v4: an open-source chat model with multi-turn dialogue support for a range of conversational scenarios

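TinyMistral 248M Chat V4

Developed by Felladrin

TinyMistral-248M-Chat is a chat model fine-tuned from TinyMistral-248M. It supports multi-turn dialogue and is intended for general conversational use.

Large language model

Transformers

Language: English | License: Apache-2.0 | #lightweight-chat-model #multi-domain-QA #ChatML-format-support

Downloads: 516

Released: 11/16/2023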

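Model Overview

This is a lightweight chat model fine-tuned from TinyMistral-248M. It supports a variety of dialogue tasks and can understand and respond to user inquiries.

Model Features

Lightweight model

Only 248M parameters, which suits resource-constrained environments.

Multi-turn dialogue support

Can handle complex conversations that span multiple turns.

Multi-dataset training

Trained on multiple high-quality datasets, including ultrachat_200k and OpenOrca.

DPO fine-tuning

Further fine-tuned with DPO to improve dialogue quality.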

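Model Capabilities

Text generation

Multi-turn dialogue

Question answering

Creative writing

Use Cases

Dialogue systems

Customer service bot

Handles customer inquiries and answers questions.

Can provide accurate and friendly answers.

Personal assistant

Helps users with everyday tasks and provides information.

Can understand and respond to user needs.

Education

Study assistant

Helps students with questions that come up while studying.

Provides clear and accurate study advice.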

🚀 TinyMistral-248M-Chat

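TinyMistral-248M-Chat is fine-tuned from Locutusque/TinyMistral-248M on multiple datasets. It is meant for text generation and expects a specific prompt format (ChatML).

🚀 Quick Start

The model can be used for text generation. Key facts:

Base model: Locutusque/TinyMistral-248M, with two special tokens added (<|im_start|> and <|im_end|>).
Datasets: multiple, including ultrachat_200k and OpenOrca
License: Apache License 2.0

✨ Key Features

Supports a specific prompt format built on the <|im_start|> and <|im_end|> tokens.
Can be used for text generation; the example code is enough to get started.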
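
The prompt format can be illustrated without loading the model. The sketch below assumes the common ChatML layout, in which each turn is wrapped as `<|im_start|>{role}`, a newline, the content, and `<|im_end|>`. The helper name `to_chatml` is hypothetical, and the exact template stored in this model's tokenizer may differ in whitespace or defaults, so prefer `tokenizer.apply_chat_template` in real use.

```python
from typing import Dict, List

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def to_chatml(messages: List[Dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Render {"role", "content"} messages as a ChatML-style prompt (assumed layout)."""
    parts = []
    for message in messages:
        # Each turn: <|im_start|>{role}\n{content}<|im_end|>\n
        parts.append(f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its reply.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


example = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
print(example)
```

Building the string by hand is mainly useful for inspecting what the model actually sees; the tokenizer's own template remains the reference.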

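📦 Installation

The source documentation gives no installation steps. The usage example imports the transformers and torch packages, so both need to be available.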

💻 Usage Example

Basic usage

from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
import torch

model_path = "Felladrin/TinyMistral-248M-Chat-v4"
# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path).to(device)
# Print tokens to stdout as they are generated.
streamer = TextStreamer(tokenizer)
# A multi-turn conversation: system prompt, one earlier exchange, and the new user question.
messages = [
    {
        "role": "system",
        "content": "You are a highly knowledgeable and friendly assistant. Your goal is to understand and respond to user inquiries with clarity. Your interactions are always respectful, helpful, and focused on delivering the most accurate information to the user.",
    },
    {
        "role": "user",
        "content": "Hey! Got a question for you!",
    },
    {
        "role": "assistant",
        "content": "Sure! What's it?",
    },
    {
        "role": "user",
        "content": "What are some potential applications for quantum computing?",
    },
]
# Render the messages with the tokenizer's chat template and open an assistant turn.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(prompt, return_tensors="pt").to(device)
# Sample a reply; the streamer prints it while generation runs.
model.generate(
    inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_length=tokenizer.model_max_length,
    streamer=streamer,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.pad_token_id,
    do_sample=True,
    temperature=0.6,
    top_p=0.8,
    top_k=0,
    min_p=0.1,
    typical_p=0.2,
    repetition_penalty=1.176,
)

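📚 Documentation

Training

The model was trained with SFTTrainer using the following settings:

Hyperparameter	Value
Learning rate	2e-5
Total train batch size	32
Max sequence length	2048
Weight decay	0.01
Warmup ratio	0.1
NEFTune noise alpha	5
Optimizer	Adam, betas=(0.9, 0.999), epsilon=1e-08
Scheduler	cosine
Random seed	42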
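
To make the interaction of the learning rate, warmup ratio, and cosine scheduler concrete, here is a minimal sketch of a linear-warmup plus cosine-decay schedule. It assumes the common formulation (linear ramp to the peak, then half a cosine period down to zero). The function name `lr_at` and the total of 1,000 steps are hypothetical; the real step count depends on dataset size and epochs, which the table does not give.

```python
import math

PEAK_LR = 2e-5       # learning rate from the SFT settings table
WARMUP_RATIO = 0.1   # warmup ratio from the SFT settings table


def lr_at(step: int, total_steps: int) -> float:
    """Learning rate at `step` under linear warmup followed by cosine decay (assumed form)."""
    warmup_steps = int(WARMUP_RATIO * total_steps)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return PEAK_LR * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half a cosine period: peak at progress 0, zero at progress 1.
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


TOTAL_STEPS = 1000  # hypothetical, for illustration only
for step in (0, 50, 100, 550, 1000):
    print(step, lr_at(step, TOTAL_STEPS))
```

With these assumptions the rate reaches 2e-5 at step 100, falls to about half of that midway through the decay phase, and ends at zero.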

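The model was then fine-tuned with DPO through LLaMA-Factory, using the following hyperparameters and command:

Parameter	Value
Dataset	HuggingFaceH4/ultrafeedback_binarized
Learning rate	1e-06
Train batch size	4
Eval batch size	8
Random seed	42
Distributed type	multi-GPU
Number of devices	8
Gradient accumulation steps	4
Total train batch size	128
Total eval batch size	64
Optimizer	adamw_8bit, betas=(0.9, 0.999), epsilon=1e-08
LR scheduler type	cosine
LR scheduler warmup ratio	0.1
Number of epochs	2.0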

llamafactory-cli train \
    --stage dpo \
    --do_train True \
    --model_name_or_path ~/TinyMistral-248M-Chat \
    --preprocessing_num_workers $(python -c "import os; print(max(1, os.cpu_count() - 2))") \
    --dataloader_num_workers $(python -c "import os; print(max(1, os.cpu_count() - 2))") \
    --finetuning_type full \
    --flash_attn auto \
    --enable_liger_kernel True \
    --dataset_dir data \
    --dataset ultrafeedback \
    --cutoff_len 1024 \
    --learning_rate 1e-6 \
    --num_train_epochs 2.0 \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type linear \
    --max_grad_norm 1.0 \
    --logging_steps 10 \
    --save_steps 50 \
    --save_total_limit 1 \
    --warmup_ratio 0.1 \
    --packing False \
    --report_to tensorboard \
    --output_dir ~/TinyMistral-248M-Chat-v4 \
    --pure_bf16 True \
    --plot_loss True \
    --trust_remote_code True \
    --ddp_timeout 180000000 \
    --include_tokens_per_second True \
    --include_num_input_tokens_seen True \
    --optim adamw_8bit \
    --pref_beta 0.5 \
    --pref_ftx 0 \
    --pref_loss simpo \
    --gradient_checkpointing True
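
The total batch sizes listed for the DPO stage follow from the per-device batch size, the number of devices, and the gradient accumulation steps. The short check below reproduces that arithmetic; the helper name `effective_batch_size` is hypothetical, and it assumes that evaluation uses no gradient accumulation.

```python
def effective_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Samples contributing to one optimizer step (or one evaluation step) across all devices."""
    return per_device * num_devices * grad_accum_steps


# Values taken from the DPO hyperparameter table.
train_total = effective_batch_size(per_device=4, num_devices=8, grad_accum_steps=4)
eval_total = effective_batch_size(per_device=8, num_devices=8)

print(train_total, eval_total)  # 128 64
```

Both results match the "total train batch size" and "total eval batch size" rows, which is a quick way to confirm that the command's `--per_device_train_batch_size 4` and `--gradient_accumulation_steps 4` are consistent with an 8-device run.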