Mistral-7B-Instruct-v0.2-fp8 open-source AI model: high accuracy with significantly improved inference efficiency

Mistral 7B Instruct V0.2 Fp8

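Developed by FriendliAI

Mistral-7B-Instruct-v0.2 quantized to FP8 precision by FriendliAI, which significantly improves inference efficiency while maintaining high accuracy.

Large language model

Transformers

License: Apache-2.0 #32K long-context support #Instruction-tuned #Efficient FP8 inference

Downloads: 37

Released: 3/28/2024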

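Model overview

An instruction-tuned large language model based on Mistral-7B-v0.2. It supports a 32k context window and is suited to dialogue and instruction-following tasks.

Model features

FP8 quantization

Quantized to FP8, which significantly improves inference efficiency while maintaining high accuracy.

Extended context window

Handles long contexts of up to 32k tokens (new in v0.2).

Instruction tuning

Tuned for instruction-following tasks and supports complex conversational interaction.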

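Model capabilities

Text generation

Dialogue systems

Instruction following

Content creation

Use cases

Dialogue systems

Intelligent assistant

Build conversational assistants that understand complex instructions.

Generates natural, fluent conversational responses.

Content generation

Creative writing

Assists with story writing and content generation.

Generates coherent creative text.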

🚀 Mistral-7B-Instruct-v0.2 - FP8

This project contains the Mistral-7B-Instruct-v0.2 model quantized to FP8 by FriendliAI, which significantly improves inference efficiency while maintaining high accuracy.

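🚀 Quick start

Before using this model, complete the following preparation steps:

Sign up for Friendli Suite. Friendli Container can be used free of charge for four weeks.
Prepare a personal access token (PAT) as described under "Prepare a personal access token".
Prepare a Friendli Container secret as described under "Prepare a container secret".

Prepare a personal access token

A personal access token (PAT) is the user credential for logging in to the container registry.

Log in to Friendli Suite.
Go to User Settings > Tokens and click 'Create new token'.
Save the value of the created token.

Prepare a container secret

A container secret is the credential for launching the Friendli Container image. Pass it as an environment variable when you run the container image.

Log in to Friendli Suite.
Go to Container > Container Secrets and click 'Create secret'.
Save the value of the created secret.

Pull the Friendli Container image

Log in to the Docker client with the personal access token you created earlier.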

export FRIENDLI_PAT="YOUR PAT"
docker login registry.friendli.ai -u $YOUR_EMAIL -p $FRIENDLI_PAT

Pull the image:

docker pull registry.friendli.ai/trial

Run the Friendli Container

Once the Friendli Container image is ready, launch it to create a serving endpoint.

docker run \
  --gpus '"device=0"' \
  -p 8000:8000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  -e FRIENDLI_CONTAINER_SECRET="YOUR CONTAINER SECRET" \
  registry.friendli.ai/trial \
    --web-server-port 8000 \
    --hf-model-name FriendliAI/Mistral-7B-Instruct-v0.2-fp8
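
Once the container is listening on port 8000, a client can send it prompts over HTTP. The sketch below only builds such a request, so it runs without a live server. The route `/v1/completions` and the `prompt` / `max_tokens` fields are assumptions modelled on OpenAI-style completion APIs and are not taken from this page; the helper names are hypothetical. Check the Friendli Container documentation for the actual route and schema.

```python
import json
import urllib.request

# Assumption: an OpenAI-style completions route. This path is NOT stated on
# this page; verify it against the Friendli Container documentation.
COMPLETIONS_PATH = "/v1/completions"


def build_request(prompt, max_tokens=64, host="localhost", port=8000):
    """Build (but do not send) a JSON POST request for the local endpoint."""
    body = json.dumps({"prompt": prompt, "max_tokens": max_tokens}).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host}:{port}{COMPLETIONS_PATH}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_request("[INST] What is your favourite condiment? [/INST]")
print(request.get_method(), request.full_url)
# With the container running, call send(request) to get the completion.
```

The port matches the `-p 8000:8000` mapping and `--web-server-port 8000` flag in the `docker run` command above.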

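✨ Key features

This repository contains the Mistral-7B-Instruct-v0.2 model quantized to FP8 by FriendliAI, which significantly improves inference efficiency while maintaining high accuracy.
Note that FP8 is supported only on the NVIDIA Ada, Hopper, and Blackwell GPU architectures.
This model is compatible with Friendli Container.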
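
The architecture requirement can be checked before pulling the image. The sketch below maps it to CUDA compute capability: Ada is 8.9, Hopper is 9.0, and Blackwell is 10.0 or higher, so it treats anything at or above 8.9 as FP8-capable. That threshold is an interpretation of the note above, not a check documented by FriendliAI, and `supports_fp8` is a hypothetical helper.

```python
# Assumption: "Ada, Hopper, and Blackwell" corresponds to CUDA compute
# capability 8.9 and above (Ada = 8.9, Hopper = 9.0, Blackwell = 10.0+).
MIN_FP8_CAPABILITY = (8, 9)


def supports_fp8(major, minor):
    """Return True if a GPU with this compute capability is Ada or newer."""
    return (major, minor) >= MIN_FP8_CAPABILITY


# A few well-known GPUs and their compute capabilities.
examples = [
    ("A100 (Ampere)", (8, 0)),
    ("L4 (Ada)", (8, 9)),
    ("H100 (Hopper)", (9, 0)),
]
for name, capability in examples:
    print(f"{name}: FP8 supported = {supports_fp8(*capability)}")
```

With PyTorch installed, `torch.cuda.get_device_capability(0)` returns the `(major, minor)` pair for the first GPU, which can be passed in as `supports_fp8(*torch.cuda.get_device_capability(0))`.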

💻 Usage example

Basic usage

from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to load the model onto

# This example loads the original Mistral AI checkpoint from the Hugging Face Hub.
model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")

# Conversation history; the final user turn is the one the model answers.
messages = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
    {"role": "user", "content": "Do you have mayonnaise recipes?"}
]

# Apply the chat template (wraps user turns in [INST] ... [/INST]) and tokenize.
encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")

model_inputs = encodeds.to(device)
model.to(device)

# Sample up to 1000 new tokens, then decode the full sequence (prompt plus reply).
generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])

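📚 Detailed documentation

Model information

Property	Details
Model creator	Mistral AI
Original model	Mistral-7B-Instruct-v0.2
Model type	Text generation
Quantized by	FriendliAI
License	See the license in the original model card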

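Instruction format

To take advantage of the instruction fine-tuning, wrap your prompt in [INST] and [/INST] tokens. The very first instruction must begin with the beginning-of-sentence ID; subsequent instructions must not. The assistant's generation ends with the end-of-sentence token ID.

For example: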

text = "<s>[INST] What is your favourite condiment? [/INST]"
"Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!</s> "
"[INST] Do you have mayonnaise recipes? [/INST]"

This format is available as a chat template through the apply_chat_template() method.
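
To make the rules concrete, here is a minimal sketch that renders a message list into the same string as the example above. `build_prompt` is a hypothetical helper written for illustration. The tokenizer's own chat template remains the authoritative implementation and may differ in whitespace details between releases.

```python
BOS, EOS = "<s>", "</s>"


def build_prompt(messages):
    """Render alternating user/assistant messages in the [INST] format."""
    parts = [BOS]  # only the first instruction is preceded by the BOS token
    for index, message in enumerate(messages):
        expected_role = "user" if index % 2 == 0 else "assistant"
        if message["role"] != expected_role:
            raise ValueError("roles must alternate user/assistant, starting with user")
        if message["role"] == "user":
            parts.append(f"[INST] {message['content']} [/INST]")
        else:
            # Assistant turns end with the EOS token, followed by a space,
            # as in the example string above.
            parts.append(f"{message['content']}{EOS} ")
    return "".join(parts)


conversation = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Lemon juice."},
    {"role": "user", "content": "Do you have mayonnaise recipes?"},
]
print(build_prompt(conversation))
```

When tokenizing a hand-built string like this, make sure the tokenizer does not prepend a second beginning-of-sentence token.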

Troubleshooting

If you see the following error:

Traceback (most recent call last):
  File "", line 1, in
  File "/transformers/models/auto/auto_factory.py", line 482, in from_pretrained
    config, kwargs = AutoConfig.from_pretrained(
  File "/transformers/models/auto/configuration_auto.py", line 1022, in from_pretrained
    config_class = CONFIG_MAPPING[config_dict["model_type"]]
  File "/transformers/models/auto/configuration_auto.py", line 723, in __getitem__
    raise KeyError(key)
KeyError: 'mistral'

Installing transformers from source should solve it:

pip install git+https://github.com/huggingface/transformers

This should not be required after transformers-v4.33.4.
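
A quick way to tell whether an environment is likely to hit this error is to compare the installed version against that threshold. The sketch below uses hypothetical helper names and treats 4.33.4 itself as possibly affected, which is one reading of "after"; it parses version strings directly, so it runs without transformers installed.

```python
import re

# Threshold taken from the note above; treating 4.33.4 itself as possibly
# affected is an assumption.
THRESHOLD = (4, 33, 4)


def parse_version(version):
    """Extract (major, minor, patch) from strings such as '4.34.0.dev0'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def may_need_source_install(version):
    """Return True if this transformers version may raise KeyError: 'mistral'."""
    return parse_version(version) <= THRESHOLD


for candidate in ("4.33.0", "4.33.4", "4.34.0"):
    print(candidate, may_need_source_install(candidate))
```

In a real environment, pass `transformers.__version__` to `may_need_source_install`.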

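Limitations

The Mistral 7B Instruct model is a quick demonstration that the base model can be easily fine-tuned to achieve compelling performance. It does not have any moderation mechanisms. We look forward to engaging with the community on ways to make the model respect guardrails more closely, so that it can be deployed in environments that require moderated outputs.

The Mistral AI team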

Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Blanche Savary, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Emma Bou Hanna, Florian Bressand, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Lélio Renard Lavaud, Louis Ternon, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Théophile Gervet, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed.

🔧 Technical details

Mistral-7B-v0.2 has the following changes compared to Mistral-7B-v0.1:

32k context window (vs 8k context in v0.1)
Rope-theta = 1e6
No sliding-window attention
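
To give a feel for what a larger Rope-theta does, the sketch below computes the wavelength of each rotary frequency pair under the standard RoPE formula, where the inverse frequency for pair i is theta^(-2i/d). A larger base stretches the slowest-rotating dimensions over many more positions, which is commonly associated with longer usable context. The head dimension of 128 and the comparison base of 1e4 are assumptions not stated on this page, and `rope_wavelengths` is a hypothetical helper.

```python
import math


def rope_wavelengths(theta, head_dim=128):
    """Wavelength, in token positions, of each rotary frequency pair.

    Standard RoPE uses inverse frequencies theta ** (-2i / head_dim) for
    i = 0 .. head_dim/2 - 1; the wavelength is 2*pi divided by that frequency.
    """
    return [2 * math.pi * theta ** (2 * i / head_dim) for i in range(head_dim // 2)]


# Assumption: 1e4 as the smaller comparison base; 1e6 is the value listed above.
for theta in (1e4, 1e6):
    longest = rope_wavelengths(theta)[-1]
    print(f"theta={theta:.0e}: longest wavelength ~ {longest:,.0f} tokens")
```

The shortest wavelength is 2π positions regardless of the base; only the low-frequency end of the spectrum changes.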

For full details of this model, please read our paper and release blog post.

📄 License

See the license in the original model card.
