calme-2.3-llama3-70b: Open-Source Large Language Model, DPO Fine-Tuned with Strong Results on Multiple Benchmarks

Calme 2.3 Llama3 70b

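Developed by MaziyarPanahi

A large language model fine-tuned with DPO from Meta-Llama-3-70B-Instruct, with strong results on several benchmarks

Large Language Model

Transformers

English  #DPO fine-tuning #Multi-task reasoning #High accuracy

Downloads: 31

Release date: 4/27/2024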

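Model Overview

This model is a DPO fine-tuned version of Meta-Llama-3-70B-Instruct, aimed at improving text generation quality and factual accuracy

Model Features

DPO fine-tuning

Fine-tuned with Direct Preference Optimization (DPO) to improve output quality

High performance

Strong results across the Open LLM Leaderboard benchmarks, with an average score of 78.74

ChatML support

Uses the ChatML prompt template, which suits conversational interaction

Model Capabilities

Text generation

Dialogue systems

Question answering

Knowledge reasoning

Use Cases

Intelligent assistants

Chatbots

Can be used to build intelligent dialogue systems for specialized domains

Education

Knowledge Q&A

Can be used for knowledge question-answering systems in education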

🚀 MaziyarPanahi/calme-2.3-llama3-70b

This model is a DPO fine-tune of meta-llama/Meta-Llama-3-70B-Instruct for text generation tasks. It averages 78.74 across the six Open LLM Leaderboard benchmarks.

🚀 Quick Start

You can use this model with the Hugging Face transformers library by passing MaziyarPanahi/calme-2.3-llama3-70b as the model name.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer, pipeline

model_id = "MaziyarPanahi/calme-2.3-llama3-70b"

# Load the weights in bfloat16 and spread them across the available devices.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
    # attn_implementation="flash_attention_2"
)

tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    trust_remote_code=True
)

# Print tokens to stdout as they are generated.
streamer = TextStreamer(tokenizer)

# Named `pipe` so the imported `pipeline` function is not shadowed.
# The model is already loaded in bfloat16, so no extra model_kwargs are needed.
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    streamer=streamer
)

# You can then use the pipeline to generate text.

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]

prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Stop on any of these tokens; "<|eot_id|>" is included as an extra safeguard.
# A token missing from the vocabulary can resolve to None, which is skipped.
stop_tokens = ["<|im_end|>", "<|eot_id|>"]
terminators = [tokenizer.eos_token_id] + [
    token_id
    for token_id in map(tokenizer.convert_tokens_to_ids, stop_tokens)
    if token_id is not None
]

outputs = pipe(
    prompt,
    max_new_tokens=2048,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)
# Strip the prompt so that only the newly generated text is printed.
print(outputs[0]["generated_text"][len(prompt):])

✨ Key Features

Fine-tuned with DPO from meta-llama/Meta-Llama-3-70B-Instruct.
Supports the ChatML prompt template.
Quantized GGUF models are available.
Strong evaluation results on several datasets.
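
For readers unfamiliar with DPO, the following is a minimal sketch of the standard per-example DPO loss in plain Python. It is not the author's training code: the source does not describe the training configuration, so the function name `dpo_loss`, the `beta=0.1` default, and the log-probability inputs are all illustrative assumptions.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log(sigmoid(beta * (policy margin - reference margin))).

    Each argument is the summed log-probability of a full response under the
    policy being trained or the frozen reference model. beta=0.1 is an
    illustrative value, not one taken from this model's training run.
    """
    policy_margin = policy_chosen_logp - policy_rejected_logp
    ref_margin = ref_chosen_logp - ref_rejected_logp
    logits = beta * (policy_margin - ref_margin)
    # -log(sigmoid(x)) equals softplus(-x); this form avoids overflow.
    return max(-logits, 0.0) + math.log1p(math.exp(-abs(logits)))
```

When the policy and the reference model agree, the loss equals log 2. It falls as the policy assigns relatively more probability to the preferred response than the reference does.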

📦 Quantized GGUF Models

All GGUF models are available here: MaziyarPanahi/calme-2.3-llama3-70b-GGUF

🏆 Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Average	78.74
AI2 Reasoning Challenge (25-shot)	72.35
HellaSwag (10-shot)	86.00
MMLU (5-shot)	80.47
TruthfulQA (0-shot)	63.45
Winogrande (5-shot)	82.95
GSM8k (5-shot)	87.19
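
As a quick sanity check, the reported average can be reproduced from the six benchmark scores. The sketch below uses `decimal` so the rounding is exact; the variable names are illustrative, and rounding half up is an assumption about how the leaderboard rounds.

```python
from decimal import Decimal, ROUND_HALF_UP

# Benchmark scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-shot)": Decimal("72.35"),
    "HellaSwag (10-shot)": Decimal("86.00"),
    "MMLU (5-shot)": Decimal("80.47"),
    "TruthfulQA (0-shot)": Decimal("63.45"),
    "Winogrande (5-shot)": Decimal("82.95"),
    "GSM8k (5-shot)": Decimal("87.19"),
}

# Unweighted mean of the six scores.
mean = sum(scores.values()) / len(scores)

# Round to two decimal places, with halves rounded up.
average = mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
print(average)
```

The unrounded mean is 78.735, which matches the reported 78.74 once rounded to two decimal places.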

Top 10 models on the leaderboard; Llama-3-70B fine-tuned models

📚 Detailed Documentation

Prompt Template

This model uses the ChatML prompt template:

<|im_start|>system
{system_prompt}
<|im_end|>
<|im_start|>user
{user_input}
<|im_end|>
<|im_start|>assistant
{model_output}
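
To make the template concrete, here is a minimal sketch that assembles a prompt string by hand in the layout shown above. The helper name `format_chatml` is hypothetical, and the exact whitespace is an assumption based on that layout. In practice, `tokenizer.apply_chat_template` from the Quick Start should be preferred because it uses the template shipped with the model.

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts in the ChatML layout shown above."""
    parts = []
    for message in messages:
        # Each turn: start marker plus role, the content, then the end marker.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}\n<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = format_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who are you?"},
])
print(prompt)
```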

📄 License

This model is released under the llama3 license; see LICENSE for the full terms.

Model Information

Property	Details
Model type	DPO fine-tune of `meta-llama/Meta-Llama-3-70B-Instruct`
Training data	MaziyarPanahi/truthy-dpo-v0.1-axolotl
Model creator	MaziyarPanahi
Quantized by	MaziyarPanahi
Task type	Text generation
Evaluation datasets	AI2 Reasoning Challenge, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8k, and others