🚀 mistralai/Mistral-7B-Instruct-v0.3 AWQ
This project is an AWQ-quantized version of the Mistral-7B-Instruct-v0.3 model. It handles text-generation tasks efficiently and enables fast inference on supported hardware and software.
🚀 Quick Start
Install the required packages:
```shell
pip install --upgrade autoawq autoawq-kernels
```
Python code example
Basic usage
```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer, TextStreamer

model_path = "solidrust/Mistral-7B-Instruct-v0.3-AWQ"
system_message = ("You are Mistral-7B-Instruct-v0.3, incarnated as a powerful AI. "
                  "You were created by mistralai.")

# Load the quantized model and its tokenizer
model = AutoAWQForCausalLM.from_quantized(model_path,
                                          fuse_layers=True)
tokenizer = AutoTokenizer.from_pretrained(model_path,
                                          trust_remote_code=True)

# Stream generated text token by token, hiding the prompt and special tokens
streamer = TextStreamer(tokenizer,
                        skip_prompt=True,
                        skip_special_tokens=True)

# ChatML-style prompt template used by this quantized checkpoint
prompt_template = """\
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant"""

prompt = ("You're standing on the surface of the Earth. "
          "You walk one mile south, one mile west and one mile north. "
          "You end up exactly where you started. Where are you?")

# Tokenize the formatted prompt and move the input IDs to the GPU
tokens = tokenizer(prompt_template.format(system_message=system_message, prompt=prompt),
                   return_tensors='pt').input_ids.cuda()

# Generate up to 512 new tokens, streaming the output as it is produced
generation_output = model.generate(tokens,
                                   streamer=streamer,
                                   max_new_tokens=512)
```
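The prompt template above is plain Python string formatting, so it can be sanity-checked without downloading the model or using a GPU. A minimal standalone sketch (the system message and question here are illustrative placeholders, not values from the model card):

```python
# Standalone check of the ChatML-style prompt assembly used in the example above.
# No model, tokenizer, or GPU is required.
prompt_template = """\
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant"""

rendered = prompt_template.format(
    system_message="You are a helpful assistant.",  # placeholder
    prompt="Where am I?",                           # placeholder
)
print(rendered)
```

The rendered string ends with the opening assistant tag, so the model's generation continues directly as the assistant's reply.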
About AWQ
AWQ is an efficient, accurate, and very fast low-bit weight quantization method, currently supporting 4-bit quantization. Compared with GPTQ, it delivers faster Transformer-based inference with quality that matches or exceeds the most commonly used GPTQ settings.
AWQ models are currently supported only on Linux and Windows, and only on NVIDIA GPUs. macOS users should use GGUF models instead.
It is supported by a range of downstream inference tools.
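As a rough illustration of why 4-bit weights matter, here is a back-of-envelope estimate of weight storage for a 7B-parameter model (these are order-of-magnitude numbers that ignore activation memory, the KV cache, and quantization overhead such as scales and zero points):

```python
# Back-of-envelope weight-memory estimate for a 7B-parameter model.
params = 7_000_000_000

def weight_gib(bits_per_weight: float) -> float:
    """Approximate weight storage in GiB at a given precision."""
    return params * bits_per_weight / 8 / 2**30

fp16 = weight_gib(16)  # half precision: roughly 13 GiB
awq4 = weight_gib(4)   # 4-bit AWQ weights: roughly 3.3 GiB
print(f"fp16: {fp16:.1f} GiB, 4-bit AWQ: {awq4:.1f} GiB")
```

The roughly 4x reduction in weight memory is what lets a 7B model fit on consumer GPUs with much less VRAM.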
📚 Documentation

| Property | Details |
| --- | --- |
| Base model | mistralai/Mistral-7B-Instruct-v0.3 |
| Inference | No |
| Library name | transformers |
| License | apache-2.0 |
| Pipeline task | text-generation |
| Quantized by | Suparious |
| Tags | 4-bit, AWQ, text-generation, autotrain_compatible, endpoints_compatible |
📄 License
This model is released under the apache-2.0 license.