🚀 Cohere Labs Command R+ Model Introduction
Cohere Labs Command R+ is a 104-billion-parameter model with highly advanced capabilities, including Retrieval Augmented Generation (RAG) and tool use to automate sophisticated tasks. The model is multilingual and performs well across a wide variety of use cases such as reasoning, summarization, and question answering.
🚀 Quick Start
You can try out Cohere Labs Command R+ in our hosted Hugging Face Space before downloading the weights.
Please install transformers from the source repository that includes the necessary changes for this model:
# pip install 'git+https://github.com/huggingface/transformers.git' bitsandbytes accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM
model_id = "CohereForAI/c4ai-command-r-plus-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
# Format message with the command-r-plus chat template
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
## <BOS_TOKEN><|START_OF_TURN_TOKEN|><|USER_TOKEN|>Hello, how are you?<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>
gen_tokens = model.generate(
    input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)
gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
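Multi-turn usage is a natural extension of the snippet above. The following is a minimal sketch rather than an official example: it reuses the `tokenizer`, `model`, `messages`, `input_ids`, and `gen_tokens` objects defined above, and the follow-up question as well as the choice to decode only the newly generated tokens are assumptions made for illustration.

```python
# A minimal sketch (assumes the quick-start objects above are already defined).
# Decode only the newly generated tokens, then append the reply for the next turn.
reply = tokenizer.decode(gen_tokens[0][input_ids.shape[-1]:], skip_special_tokens=True)

messages.append({"role": "assistant", "content": reply})
messages.append({"role": "user", "content": "Can you say that in French?"})  # hypothetical follow-up

# Re-apply the chat template and generate the next turn.
next_input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
next_tokens = model.generate(next_input_ids, max_new_tokens=100, do_sample=True, temperature=0.3)
print(tokenizer.decode(next_tokens[0][next_input_ids.shape[-1]:], skip_special_tokens=True))
```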
✨ Key Features
Model Overview
Cohere Labs Command R+ is an open weights research release of a 104-billion-parameter model with highly advanced capabilities, including Retrieval Augmented Generation (RAG) and tool use to automate sophisticated tasks. The model supports multi-step tool use, which allows it to combine multiple tools over multiple steps to accomplish difficult tasks. It is a multilingual model evaluated for performance in 10 languages: English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Arabic, and Simplified Chinese. Command R+ is optimized for a variety of use cases including reasoning, summarization, and question answering.
Cohere Labs Command R+ is part of a family of open weight releases from Cohere For AI and Cohere. Our smaller companion model is Cohere Labs Command R.
Model Details
- Input: The model takes text as input only.
- Output: The model generates text only.
- Model Architecture: This is an auto-regressive language model that uses an optimized transformer architecture. After pretraining, the model uses supervised fine-tuning (SFT) and preference training to align model behavior to human preferences for helpfulness and safety.
- Languages covered: The model is optimized to perform well in the following languages: English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Simplified Chinese, and Arabic. Pre-training data additionally included the following 13 languages: Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, and Persian.
- Context length: Command R+ supports a context length of 128K.
Tool Use & Multihop Capabilities
Command R+ has been specifically trained with conversational tool use capabilities. These have been trained into the model via a mixture of supervised fine-tuning and preference fine-tuning, using a specific prompt template. Deviating from this prompt template may reduce performance, but we encourage experimentation.
Command R+'s tool use functionality takes a conversation as input (with an optional user-system preamble), along with a list of available tools. The model will then generate a JSON-formatted list of actions to execute on a subset of those tools. Command R+ may use one of its supplied tools more than once.
The model has been trained to recognise a special `directly_answer` tool, which it uses to indicate that it doesn't want to use any of its other tools. The ability to abstain from calling a specific tool can be useful in a range of situations, such as greeting a user or asking clarifying questions. We recommend including the `directly_answer` tool, but it can be removed or renamed if required.
Comprehensive documentation for working with Command R+'s tool use prompt template can be found here.
Command R+ also supports Hugging Face's tool use API.
Below is a minimal working example of how to render a prompt:
Usage: Rendering Tool Use Prompts [click to expand]
from transformers import AutoTokenizer
model_id = "CohereForAI/c4ai-command-r-plus-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# define conversation input:
conversation = [
    {"role": "user", "content": "Whats the biggest penguin in the world?"}
]

# Define tools available for the model to use:
tools = [
    {
        "name": "internet_search",
        "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet",
        "parameter_definitions": {
            "query": {
                "description": "Query to search the internet with",
                "type": "str",
                "required": True
            }
        }
    },
    {
        "name": "directly_answer",
        "description": "Calls a standard (un-augmented) AI chatbot to generate a response given the conversation history",
        "parameter_definitions": {}
    }
]

# render the tool use prompt as a string:
tool_use_prompt = tokenizer.apply_tool_use_template(
    conversation,
    tools=tools,
    tokenize=False,
    add_generation_prompt=True,
)
print(tool_use_prompt)
Usage: Rendering prompts with the Tool Use API [click to expand]
from transformers import AutoTokenizer
model_id = "CohereForAI/c4ai-command-r-plus-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# define conversation input:
conversation = [
    {"role": "user", "content": "Whats the biggest penguin in the world?"}
]

# Define tools available for the model to use
# Type hints and docstrings from Python functions are automatically extracted
def internet_search(query: str):
    """
    Returns a list of relevant document snippets for a textual query retrieved from the internet

    Args:
        query: Query to search the internet with
    """
    pass

def directly_answer():
    """
    Calls a standard (un-augmented) AI chatbot to generate a response given the conversation history
    """
    pass

tools = [internet_search, directly_answer]

# render the tool use prompt as a string:
tool_use_prompt = tokenizer.apply_chat_template(
    conversation,
    tools=tools,
    tokenize=False,
    add_generation_prompt=True,
)
print(tool_use_prompt)
Example Rendered Tool Use Prompt [click to expand]
<BOS_TOKEN><|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|># Safety Preamble
The instructions in this section override those in the task description and style guide sections. Don't answer questions that are harmful or immoral.
# System Preamble
## Basic Rules
You are a powerful conversational AI trained by Cohere to help people. You are augmented by a number of tools, and your job is to use and consume the output of these tools to best help the user. You will see a conversation history between yourself and a user, ending with an utterance from the user. You will then see a specific instruction instructing you what kind of response to generate. When you answer the user's requests, you cite your sources in your answers, according to those instructions.
# User Preamble
## Task and Context
You help people answer their questions and other requests interactively. You will be asked a very wide array of requests on all kinds of topics. You will be equipped with a wide range of search engines or similar tools to help you, which you use to research your answer. You should focus on serving the user's needs as best you can, which will be wide-ranging.
## Style Guide
Unless the user asks for a different style of answer, you should answer in full sentences, using proper grammar and spelling.
## Available Tools
Here is a list of tools that you have available to you:
```python
def internet_search(query: str) -> List[Dict]:
    """Returns a list of relevant document snippets for a textual query retrieved from the internet
    Args:
        query (str): Query to search the internet with
    """
    pass
```
```python
def directly_answer() -> List[Dict]:
    """Calls a standard (un-augmented) AI chatbot to generate a response given the conversation history
    """
    pass
```<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|USER_TOKEN|>Whats the biggest penguin in the world?<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>Write 'Action:' followed by a json-formatted list of actions that you want to perform in order to produce a good response to the user's last input. You can use any of the supplied tools any number of times, but you should aim to execute the minimum number of necessary actions for the input. You should use the `directly-answer` tool if calling the other tools is unnecessary. The list of actions you want to call should be formatted as a list of json objects, for example:
```json
[
    {
        "tool_name": title of the tool in the specification,
        "parameters": a dict of parameters to input into the tool as they are defined in the specs, or {} if it takes no parameters
    }
]```<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>
Example Rendered Tool Use Completion [click to expand]
Action: ```json
[
    {
        "tool_name": "internet_search",
        "parameters": {
            "query": "biggest penguin in the world"
        }
    }
]
```
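The completion above is plain text, so executing the requested actions is up to your application. The sketch below is illustrative and is not part of the model card or the transformers API: it parses the `Action:` block from a completion string and dispatches each action to a matching Python function. The `completion` string and the stub tool implementations are assumptions made for the example.

```python
import json
import re

# Hypothetical completion text, shaped like the example completion above.
completion = (
    "Action: ```json\n"
    '[{"tool_name": "internet_search", '
    '"parameters": {"query": "biggest penguin in the world"}}]\n'
    "```"
)

def parse_actions(text: str):
    """Extract the JSON action list from an 'Action:' tool-use completion."""
    match = re.search(r"Action:\s*```json\s*(.*?)```", text, re.DOTALL)
    return json.loads(match.group(1)) if match else []

# Hypothetical stub implementations of the tools declared in the prompt.
def internet_search(query: str):
    return [{"title": "stub result", "text": f"snippets for {query!r}"}]

def directly_answer():
    return []

TOOLS = {"internet_search": internet_search, "directly_answer": directly_answer}

for action in parse_actions(completion):
    result = TOOLS[action["tool_name"]](**action["parameters"])
    print(action["tool_name"], "->", result)
```

In a full RAG pipeline you would typically feed the returned snippets back to the model as documents for grounded generation, as described in the next section.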
Grounded Generation and RAG Capabilities
Command R+ has been specifically trained with grounded generation capabilities. This means that it can generate responses based on a list of supplied document snippets, and it will include grounding spans (citations) in its response indicating the source of the information. This can be used to enable behaviors such as grounded summarization and the final step of Retrieval Augmented Generation (RAG). This behavior has been trained into the model via a mixture of supervised fine-tuning and preference fine-tuning, using a specific prompt template. Deviating from this prompt template may reduce performance, but we encourage experimentation.
Command R+'s grounded generation behavior takes a conversation as input (with an optional user-supplied system preamble indicating task, context, and desired output style), along with a list of retrieved document snippets. The document snippets should be chunks rather than long documents, typically around 100-400 words per chunk. Document snippets consist of key-value pairs. The keys should be short descriptive strings; the values can be text or semi-structured.
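Since the template expects short snippets rather than whole documents, longer sources typically need to be chunked first. The sketch below only illustrates that preprocessing step and is not from the model card; the roughly 300-word target and the `title`/`text` keys are assumptions chosen to match the snippet format used in the rendering example further below.

```python
# A minimal sketch: split a long document into roughly 300-word snippets,
# each expressed as short key-value pairs ("title" / "text").
def chunk_document(title: str, text: str, words_per_chunk: int = 300):
    words = text.split()
    chunks = []
    for start in range(0, len(words), words_per_chunk):
        part = " ".join(words[start:start + words_per_chunk])
        chunks.append({"title": f"{title} (part {len(chunks) + 1})", "text": part})
    return chunks

long_text = "Emperor penguins are the tallest penguins, growing up to 122 cm in height. " * 60
documents = chunk_document("Tall penguins", long_text)
print(len(documents), documents[0]["title"])
```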
By default, Command R+ will generate grounded responses by first predicting which documents are relevant, then predicting which ones it will cite, then generating an answer, and finally inserting grounding spans into the answer. See the example below; this is referred to as `accurate` grounded generation.
The model is trained with a number of other answering modes, which can be selected by prompt changes. A `fast` citation mode is supported in the tokenizer, which will directly generate an answer with grounding spans in it, without first writing the answer out in full. This sacrifices some grounding accuracy in favor of generating fewer tokens.
Comprehensive documentation for working with Command R+'s grounded generation prompt template can be found here.
Below is a minimal working example of how to render a prompt:
Usage: Rendering Grounded Generation prompts [click to expand]
from transformers import AutoTokenizer
model_id = "CohereForAI/c4ai-command-r-plus-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# define conversation input:
conversation = [
    {"role": "user", "content": "Whats the biggest penguin in the world?"}
]

# define documents to ground on:
documents = [
    { "title": "Tall penguins", "text": "Emperor penguins are the tallest growing up to 122 cm in height." },
    { "title": "Penguin habitats", "text": "Emperor penguins only live in Antarctica."}
]

# render the grounded generation prompt as a string:
grounded_generation_prompt = tokenizer.apply_grounded_generation_template(
    conversation,
    documents=documents,
    citation_mode="accurate", # or "fast"
    tokenize=False,
    add_generation_prompt=True,
)
print(grounded_generation_prompt)
Example Rendered Grounded Generation Prompt [click to expand]
<BOS_TOKEN><|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|># Safety Preamble
The instructions in this section override those in the task description and style guide sections. Don't answer questions that are harmful or immoral.
# System Preamble
## Basic Rules
You are a powerful conversational AI trained by Cohere to help people. You are augmented by a number of tools, and your job is to use and consume the output of these tools to best help the user. You will see a conversation history between yourself and a user, ending with an utterance from the user. You will then see a specific instruction instructing you what kind of response to generate. When you answer the user's requests, you cite your sources in your answers, according to those instructions.
# User Preamble
## Task and Context
You help people answer their questions and other requests interactively. You will be asked a very wide array of requests on all kinds of topics. You will be equipped with a wide range of search engines or similar tools to help you, which you use to research your answer. You should focus on serving the user's needs as best you can, which will be wide-ranging.
## Style Guide
Unless the user asks for a different style of answer, you should answer in full sentences, using proper grammar and spelling.<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|USER_TOKEN|>Whats the biggest penguin in the world?<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|><results>
Document: 0
title: Tall penguins
text: Emperor penguins are the tallest growing up to 122 cm in height.
Document: 1
title: Penguin habitats
text: Emperor penguins only live in Antarctica.
</results><|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>Carefully perform the following instructions, in order, starting each with a new line.
Firstly, Decide which of the retrieved documents are relevant to the user's last input by writing 'Relevant Documents:' followed by comma-separated list of document numbers. If none are relevant, you should instead write 'None'.
Secondly, Decide which of the retrieved documents contain facts that should be cited in a good answer to the user's last input by writing 'Cited Documents:' followed a comma-separated list of document numbers. If you dont want to cite any of them, you should instead write 'None'.
Thirdly, Write 'Answer:' followed by a response to the user's last input in high quality natural english. Use the retrieved documents to help you. Do not insert any citations or grounding markup.
Finally, Write 'Grounded answer:' followed by a response to the user's last input in high quality natural english. Use the symbols <co: doc> and </co: doc> to indicate when a fact comes from a document in the search result, e.g <co: 0>my fact</co: 0> for a fact from document 0.<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>
Example Rendered Grounded Generation Completion [click to expand]
Relevant Documents: 0,1
Cited Documents: 0,1
Answer: The Emperor Penguin is the tallest or biggest penguin in the world. It is a bird that lives only in Antarctica and grows to a height of around 122 centimetres.
Grounded answer: The <co: 0>Emperor Penguin</co: 0> is the <co: 0>tallest</co: 0> or biggest penguin in the world. It is a bird that <co: 1>lives only in Antarctica</co: 1> and <co: 0>grows to a height of around 122 centimetres.</co: 0>
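The grounding markup in the `Grounded answer:` line is plain text, so turning it into structured citations is left to your application. Below is an illustrative sketch, not an official utility: it extracts the `<co: N>...</co: N>` spans from an answer shaped like the completion above; the `grounded_answer` string is an assumption made for the example.

```python
import re

# Hypothetical grounded answer, shaped like the example completion above.
grounded_answer = (
    "The <co: 0>Emperor Penguin</co: 0> is the <co: 0>tallest</co: 0> or biggest penguin "
    "in the world. It is a bird that <co: 1>lives only in Antarctica</co: 1> and "
    "<co: 0>grows to a height of around 122 centimetres.</co: 0>"
)

# Each citation is a (document_id, cited_text) pair.
citations = [
    (int(doc_id), span)
    for doc_id, span in re.findall(r"<co: (\d+)>(.*?)</co: \1>", grounded_answer)
]
# The same answer with the grounding markup stripped out.
plain_answer = re.sub(r"</?co: \d+>", "", grounded_answer)

print(citations)
print(plain_answer)
```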
Code Capabilities
Command R+ has been optimized to interact with your code by requesting code snippets, code explanations, or code rewrites. It might not perform well out of the box for pure code completion. For better performance, we also recommend using a low temperature (or even greedy decoding) for code-generation-related instructions.
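As a concrete illustration of that recommendation, the sketch below reuses the `tokenizer` and `model` from the Quick Start section and asks for a code rewrite using greedy decoding (`do_sample=False`); the prompt text is a hypothetical example, not taken from the model card.

```python
# A minimal sketch, assuming the tokenizer and model from the Quick Start section are loaded.
code_messages = [{
    "role": "user",
    "content": "Rewrite this Python loop as a list comprehension:\n"
               "result = []\n"
               "for x in range(10):\n"
               "    result.append(x * x)",
}]
code_input_ids = tokenizer.apply_chat_template(
    code_messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
)
# Greedy decoding (do_sample=False) is usually preferable for code-related instructions.
code_tokens = model.generate(code_input_ids, max_new_tokens=200, do_sample=False)
print(tokenizer.decode(code_tokens[0][code_input_ids.shape[-1]:], skip_special_tokens=True))
```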
📚 Documentation
Model Information
- Developed by: Cohere and Cohere For AI
- Point of Contact: Cohere For AI: cohere.for.ai
- License: CC-BY-NC, requires also adhering to Cohere Lab's Acceptable Use Policy
- Model: c4ai-command-r-plus
- Model Size: 104 billion parameters
- Context length: 128K
Try the Model
You can try out Cohere Labs Command R+ in our hosted Hugging Face Space before downloading the weights. You can also try Command R+ chat in the playground here.
📄 License
This model is governed by a CC-BY-NC license and also requires adhering to Cohere Lab's Acceptable Use Policy.
⚠️ Important Notice
By submitting this form, you agree to the License Agreement and acknowledge that the information you provide will be collected, used, and shared in accordance with Cohere's Privacy Policy. You'll receive email updates about Cohere Labs and Cohere research, events, products, and services. You can unsubscribe at any time.
Property | Details |
---|---|
Languages covered | English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Arabic, and Simplified Chinese. Pre-training data additionally included Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, and Persian. |
License | CC-BY-NC, requires also adhering to Cohere Lab's Acceptable Use Policy |
Model type | Auto-regressive language model using an optimized transformer architecture |
Model size | 104 billion parameters |
Context length | 128K |



