🚀 Model Card for Mixtral-8x22B-Instruct-v0.1
The Mixtral-8x22B-Instruct-v0.1 Large Language Model (LLM) is an instruct fine-tuned version of Mixtral-8x22B-v0.1.
| Property | Details |
|---|---|
| Supported languages | English, Spanish, Italian, German, French |
| License | Apache-2.0 |
| Base model | mistralai/Mixtral-8x22B-v0.1 |
⚠️ Important note
If you want to learn more about how we process your personal data, please read our Privacy Policy.
🚀 Quick Start
💻 Usage Examples
Basic usage
Below is an example of encoding and decoding with `mistral_common`:
```python
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest

# Path to the locally downloaded model files
mistral_models_path = "MISTRAL_MODELS_PATH"

tokenizer = MistralTokenizer.v3()

completion_request = ChatCompletionRequest(messages=[UserMessage(content="Explain Machine Learning to me in a nutshell.")])

tokens = tokenizer.encode_chat_completion(completion_request).tokens
```
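The `"MISTRAL_MODELS_PATH"` placeholder above must point to a local copy of the raw model files. As a sketch of one way to fetch them with `huggingface_hub.snapshot_download` (the `allow_patterns` file names are an assumption based on Mistral's consolidated releases; drop the argument to download the full repo):

```python
from pathlib import Path

from huggingface_hub import snapshot_download

# Assumed local layout; adjust to taste
mistral_models_path = Path.home().joinpath("mistral_models", "8x22B-Instruct-v0.1")
mistral_models_path.mkdir(parents=True, exist_ok=True)

# allow_patterns file names are an assumption; omit the argument to fetch everything
snapshot_download(
    repo_id="mistralai/Mixtral-8x22B-Instruct-v0.1",
    allow_patterns=["params.json", "consolidated.safetensors", "tokenizer.model.v3"],
    local_dir=mistral_models_path,
)
```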
Advanced usage
Below is an example of running inference with `mistral_inference`:
```python
from mistral_inference.transformer import Transformer
from mistral_inference.generate import generate

# Load the model weights from the local path used above
model = Transformer.from_folder(mistral_models_path)

out_tokens, _ = generate([tokens], model, max_tokens=64, temperature=0.0, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)

# Decode the generated token ids back to text
result = tokenizer.decode(out_tokens[0])
print(result)
```
Preparing inputs with Hugging Face transformers
```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mixtral-8x22B-Instruct-v0.1")

chat = [{"role": "user", "content": "Explain Machine Learning to me in a nutshell."}]

tokens = tokenizer.apply_chat_template(chat, return_dict=True, return_tensors="pt", add_generation_prompt=True)
```
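If you want to inspect the exact prompt string the chat template produces before tokenizing, `apply_chat_template` can also return plain text; a quick sanity check:

```python
# Render the templated prompt as a string (includes special markers such as [INST])
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
print(prompt)
```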
Inference with Hugging Face transformers
```python
from transformers import AutoModelForCausalLM
import torch

# You can also use 8-bit or 4-bit quantization here
model = AutoModelForCausalLM.from_pretrained("mistralai/Mixtral-8x22B-Instruct-v0.1", torch_dtype=torch.bfloat16, device_map="auto")

# device_map="auto" already places the model, so move the inputs instead
tokens = tokens.to(model.device)

generated_ids = model.generate(**tokens, max_new_tokens=1000, do_sample=True)

# decode with HF tokenizer
result = tokenizer.decode(generated_ids[0])
print(result)
```
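The comment in the block above mentions 8-bit or 4-bit quantization. A minimal sketch of the 4-bit route via `BitsAndBytesConfig`, assuming the `bitsandbytes` package is installed (the config values here are illustrative, not a recommendation from this card):

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

# Illustrative 4-bit setup; requires bitsandbytes
quant_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x22B-Instruct-v0.1",
    quantization_config=quant_config,
    device_map="auto",
)
```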
Function calling example
```python
import torch
from transformers import AutoModelForCausalLM

from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.tool_calls import (
    Tool,
    Function,
)
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.request import ChatCompletionRequest

device = "cuda"  # the device to load the model onto

tokenizer_v3 = MistralTokenizer.v3()

mistral_query = ChatCompletionRequest(
    tools=[
        Tool(
            function=Function(
                name="get_current_weather",
                description="Get the current weather",
                parameters={
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA",
                        },
                        "format": {
                            "type": "string",
                            "enum": ["celsius", "fahrenheit"],
                            "description": "The temperature unit to use. Infer this from the users location.",
                        },
                    },
                    "required": ["location", "format"],
                },
            )
        )
    ],
    messages=[
        UserMessage(content="What's the weather like today in Paris"),
    ],
    model="test",
)

# encode_chat_completion returns a list of token ids
encodeds = tokenizer_v3.encode_chat_completion(mistral_query).tokens

model = AutoModelForCausalLM.from_pretrained("mistralai/Mixtral-8x22B-Instruct-v0.1")

# wrap the token list in a batch dimension and move it to the device
model_inputs = torch.tensor([encodeds]).to(device)
model.to(device)

generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
sp_tokenizer = tokenizer_v3.instruct_tokenizer.tokenizer
decoded = sp_tokenizer.decode(generated_ids[0].tolist())
print(decoded)
```
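The decoded output for a tool-use prompt is expected to contain the `[TOOL_CALLS]` special token followed by a JSON list of calls. A hedged parsing sketch; the exact decoded string depends on how your tokenizer handles special tokens, so treat this as illustrative rather than the library's API:

```python
import json

def parse_tool_calls(decoded: str):
    # Assumes "[TOOL_CALLS]" survives decoding and is followed by a JSON list,
    # possibly terminated by an EOS marker such as "</s>".
    marker = "[TOOL_CALLS]"
    if marker not in decoded:
        return None
    payload = decoded.split(marker, 1)[1]
    payload = payload.split("</s>", 1)[0].strip()
    return json.loads(payload)

calls = parse_tool_calls(decoded)  # e.g. a list of {"name": ..., "arguments": ...} dicts
```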
Function calling with transformers
To use this example, you'll need transformers version 4.42.0 or higher. For more information, please see the function calling guide in the transformers documentation.
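A trivial check of the installed version:

```python
import transformers

# Tool-use chat templates for this model need transformers >= 4.42.0
print(transformers.__version__)
```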
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "mistralai/Mixtral-8x22B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)

def get_current_weather(location: str, format: str):
    """
    Get the current weather

    Args:
        location: The city and state, e.g. San Francisco, CA
        format: The temperature unit to use. Infer this from the users location. (choices: ["celsius", "fahrenheit"])
    """
    pass

conversation = [{"role": "user", "content": "What's the weather like in Paris?"}]
tools = [get_current_weather]

# format and tokenize the tool use prompt
inputs = tokenizer.apply_chat_template(
    conversation,
    tools=tools,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
)

model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

inputs = inputs.to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1000)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
⚠️ Important note
For reasons of space, this example does not show a complete cycle of calling a tool and adding the tool call and tool results to the chat history so that the model can use them in its next generation. For a full tool calling example, please see the function calling guide, and note that Mixtral does use tool call IDs, so these must be included in your tool calls and tool results. They should be exactly 9 alphanumeric characters.
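As a rough illustration of that note: generating a 9-character alphanumeric ID and appending the tool call and tool result to the conversation. The message schema below follows the transformers tool-use format; treat the field names as assumptions to verify against the function calling guide.

```python
import random
import string

# Tool call IDs must be exactly 9 alphanumeric characters
tool_call_id = "".join(random.choices(string.ascii_letters + string.digits, k=9))

# Append the model's tool call to the history (schema per the transformers tool-use docs)
conversation.append(
    {
        "role": "assistant",
        "tool_calls": [
            {
                "id": tool_call_id,
                "type": "function",
                "function": {"name": "get_current_weather", "arguments": {"location": "Paris, France", "format": "celsius"}},
            }
        ],
    }
)

# Append the tool's result so the next generate() call can use it
conversation.append(
    {"role": "tool", "tool_call_id": tool_call_id, "name": "get_current_weather", "content": "22.0"}
)
```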
Instruct tokenizer
The HuggingFace tokenizer included in this release should match our own. To compare, you can run:
```
pip install mistral-common
```
```python
from mistral_common.protocol.instruct.messages import (
    AssistantMessage,
    UserMessage,
)
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.request import ChatCompletionRequest

from transformers import AutoTokenizer

tokenizer_v3 = MistralTokenizer.v3()

mistral_query = ChatCompletionRequest(
    messages=[
        UserMessage(content="How many experts ?"),
        AssistantMessage(content="8"),
        UserMessage(content="How big ?"),
        AssistantMessage(content="22B"),
        UserMessage(content="Noice 🎉 !"),
    ],
    model="test",
)

hf_messages = mistral_query.model_dump()['messages']

tokenized_mistral = tokenizer_v3.encode_chat_completion(mistral_query).tokens

tokenizer_hf = AutoTokenizer.from_pretrained('mistralai/Mixtral-8x22B-Instruct-v0.1')
tokenized_hf = tokenizer_hf.apply_chat_template(hf_messages, tokenize=True)

# The two tokenizations should be identical
assert tokenized_hf == tokenized_mistral
```
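Beyond comparing token ids, you can also eyeball the rendered prompt strings; this assumes the object returned by `encode_chat_completion` exposes a `.text` field in your `mistral_common` version:

```python
# Compare the rendered prompts as plain text (field availability may vary by version)
print(tokenizer_v3.encode_chat_completion(mistral_query).text)
print(tokenizer_hf.apply_chat_template(hf_messages, tokenize=False))
```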
Function calling and special tokens
This tokenizer includes more special tokens, related to function calling:
- `[TOOL_CALLS]`
- `[AVAILABLE_TOOLS]`
- `[/AVAILABLE_TOOLS]`
- `[TOOL_RESULTS]`
- `[/TOOL_RESULTS]`
If you want to use this model with function calling, please be sure to apply it in a similar way to what is done in our SentencePieceTokenizerV3.
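One quick, illustrative way to confirm these tokens are present in the HuggingFace tokenizer:

```python
from transformers import AutoTokenizer

tokenizer_hf = AutoTokenizer.from_pretrained("mistralai/Mixtral-8x22B-Instruct-v0.1")

for token in ["[TOOL_CALLS]", "[AVAILABLE_TOOLS]", "[/AVAILABLE_TOOLS]", "[TOOL_RESULTS]", "[/TOOL_RESULTS]"]:
    # convert_tokens_to_ids returns the unk token id for unknown tokens
    print(token, tokenizer_hf.convert_tokens_to_ids(token))
```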
The Mistral AI Team
Albert Jiang, Alexandre Sablayrolles, Alexis Tacnet, Antoine Roux, Arthur Mensch, Audrey Herblin-Stoop, Baptiste Bout, Baudouin de Monicault, Blanche Savary, Bam4d, Caroline Feldman, Devendra Singh Chaplot, Diego de las Casas, Eleonore Arcelin, Emma Bou Hanna, Etienne Metzger, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Harizo Rajaona, Jean-Malo Delignon, Jia Li, Justus Murke, Louis Martin, Louis Ternon, Lucile Saulnier, Lélio Renard Lavaud, Margaret Jennings, Marie Pellat, Marie Torelli, Marie-Anne Lachaux, Nicolas Schuhl, Patrick von Platen, Pierre Stock, Sandeep Subramanian, Sophia Yang, Szymon Antoniak, Teven Le Scao, Thibaut Lavril, Timothée Lacroix, Théophile Gervet, Thomas Wang, Valera Nemychnikova, William El Sayed, William Marshall



