# 🚀 mistralai/Mistral-7B-Instruct-v0.3 AWQ
This project provides an AWQ-quantized version of the Mistral-7B-Instruct-v0.3 model. It handles text generation tasks efficiently and delivers fast inference on supported hardware and software.
## 🚀 Quick Start

### Install the necessary packages

```shell
pip install --upgrade autoawq autoawq-kernels
```
### Python example

#### Basic usage

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer, TextStreamer

model_path = "solidrust/Mistral-7B-Instruct-v0.3-AWQ"
system_message = "You are Mistral-7B-Instruct-v0.3, incarnated as a powerful AI. You were created by mistralai."

# Load the quantized model and its tokenizer
model = AutoAWQForCausalLM.from_quantized(model_path,
                                          fuse_layers=True)
tokenizer = AutoTokenizer.from_pretrained(model_path,
                                          trust_remote_code=True)

# Stream generated tokens to stdout as they are produced
streamer = TextStreamer(tokenizer,
                        skip_prompt=True,
                        skip_special_tokens=True)

# ChatML-style prompt template
prompt_template = """\
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant"""

prompt = "You're standing on the surface of the Earth. "\
         "You walk one mile south, one mile west and one mile north. "\
         "You end up exactly where you started. Where are you?"

# Tokenize the formatted prompt and move it to the GPU
tokens = tokenizer(prompt_template.format(system_message=system_message, prompt=prompt),
                   return_tensors='pt').input_ids.cuda()

# Generate output, streamed token by token
generation_output = model.generate(tokens,
                                   streamer=streamer,
                                   max_new_tokens=512)
```
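The prompt formatting above can be verified without a GPU or the model itself. The sketch below (plain Python; `build_prompt` is a hypothetical helper name, not part of the card) shows the exact string that gets tokenized:

```python
# ChatML-style template, copied from the snippet above.
prompt_template = """\
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant"""

def build_prompt(system_message: str, prompt: str) -> str:
    """Fill the template; this string is what the tokenizer receives."""
    return prompt_template.format(system_message=system_message, prompt=prompt)

text = build_prompt("You are a helpful assistant.", "Hello!")
print(text)
```

Note that the template ends at `<|im_start|>assistant` with no trailing newline, so generation continues directly from the assistant turn.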
## About AWQ
AWQ is an efficient, accurate, and very fast low-bit weight quantization method, currently supporting 4-bit quantization. Compared with GPTQ, it offers faster Transformers-based inference with quality equivalent to or better than the most commonly used GPTQ settings.
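For reference, producing an AWQ checkpoint like this one with the AutoAWQ library follows roughly the recipe below. This is a sketch, not the quantizer's actual script: the 4-bit settings shown are AutoAWQ's commonly used GEMM defaults, the output directory name is illustrative, and calibration needs a CUDA GPU, so the heavy step is wrapped in a function and left commented out:

```python
# Commonly used AutoAWQ 4-bit settings (group size 128, GEMM kernels).
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

def quantize_to_awq(model_path: str, out_dir: str) -> None:
    # Heavy step: downloads the fp16 model and runs AWQ calibration on a CUDA GPU.
    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    model = AutoAWQForCausalLM.from_pretrained(model_path)
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    model.quantize(tokenizer, quant_config=quant_config)
    model.save_quantized(out_dir)
    tokenizer.save_pretrained(out_dir)

# quantize_to_awq("mistralai/Mistral-7B-Instruct-v0.3", "Mistral-7B-Instruct-v0.3-AWQ")
```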
AWQ models are currently supported only on Linux and Windows, and only with NVIDIA GPUs. macOS users should use GGUF models instead.
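This platform constraint can be pre-checked before attempting to load the model. Below is a minimal pre-flight sketch (the `nvidia-smi` lookup is only a heuristic for an installed NVIDIA driver, not an authoritative CUDA check; `awq_environment_ok` is a hypothetical helper name):

```python
import platform
import shutil

def awq_environment_ok() -> tuple:
    """Return (ok, reason) for whether AWQ inference can plausibly run here."""
    system = platform.system()
    if system not in ("Linux", "Windows"):
        # AWQ kernels do not run on macOS; GGUF builds are the alternative.
        return False, f"{system} is unsupported for AWQ; use a GGUF model instead"
    if shutil.which("nvidia-smi") is None:
        # Heuristic: no NVIDIA driver tooling on PATH.
        return False, "no NVIDIA driver detected (nvidia-smi not found)"
    return True, "ok"

ok, reason = awq_environment_ok()
print(ok, reason)
```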
It is supported by the following tools:
## 📚 Documentation
| Property | Details |
| --- | --- |
| Base model | mistralai/Mistral-7B-Instruct-v0.3 |
| Inference | No |
| Library name | transformers |
| License | apache-2.0 |
| Pipeline task | Text generation |
| Quantized by | Suparious |
| Tags | 4-bit, AWQ, text-generation, autotrain_compatible, endpoints_compatible |
## 📄 License

This model is released under the apache-2.0 license.