# Cohere Labs Command R+ Model Card
Cohere Labs Command R+ is a 104B-parameter model with advanced capabilities. It supports Retrieval Augmented Generation (RAG) and multi-step tool use, excelling at tasks such as reasoning, summarization, and question answering. Evaluated in 10 languages, it is optimized for a range of real-world applications.
## 🚀 Quick Start
You can try out Cohere Labs Command R+ before downloading the weights in our hosted Hugging Face Space.
Please install `transformers` from the source repository, which includes the necessary changes for this model.
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "CohereLabs/c4ai-command-r-plus"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format the message with the chat template
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")

gen_tokens = model.generate(
    input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)

gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
```
## ✨ Features
- Advanced Capabilities: Cohere Labs Command R+ offers highly advanced capabilities, including Retrieval Augmented Generation (RAG) and multi-step tool use to automate sophisticated tasks.
- Multilingual Support: Evaluated in 10 languages (English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Arabic, and Simplified Chinese), it can handle diverse language needs.
- Optimized for Use Cases: Optimized for reasoning, summarization, and question answering, making it suitable for a variety of real-world applications.
- Grounded Generation: Specifically trained with grounded generation capabilities, it can generate responses based on supplied document snippets and include citations.
## 📦 Installation
Please install `transformers` from the source repository, which includes the necessary changes for this model:

```bash
pip install 'git+https://github.com/huggingface/transformers.git'
```
## 💻 Usage Examples
### Basic Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "CohereLabs/c4ai-command-r-plus"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format the message with the chat template
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")

gen_tokens = model.generate(
    input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)

gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
```
### Advanced Usage: Quantized model via bitsandbytes (8-bit precision)
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(load_in_8bit=True)

model_id = "CohereLabs/c4ai-command-r-plus"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)

# Format the message with the chat template
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")

gen_tokens = model.generate(
    input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)

gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
```
## 📚 Documentation
### Model Details
- Input: The model takes text as input.
- Output: The model generates text as output.
- Model Architecture: An auto-regressive language model that uses an optimized transformer architecture. After pretraining, the model is aligned with human preferences for helpfulness and safety using supervised fine-tuning (SFT) and preference training.
- Languages covered: Optimized for 10 languages (English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Arabic, and Simplified Chinese). Pre-training data additionally included 13 other languages (Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, and Persian).
- Context length: Command R+ supports a context length of 128K tokens.
### Evaluations
Command R+ has been submitted to the Open LLM Leaderboard, where its evaluation results are available.

Note that these metrics do not fully capture RAG, multilingual, or tool-use performance, nor the quality of open-ended generations. For further evaluations, refer to the model's dedicated evaluation resources for RAG, multilingual, and tool use, and to the Chatbot Arena for open-ended generation.
### Grounded Generation and RAG Capabilities
Command R+ has been specifically trained with grounded generation capabilities. It can generate responses based on supplied document snippets and include citations. The code example for rendering a grounded generation prompt is as follows:
Usage: Rendering grounded generation prompts

```python
from transformers import AutoTokenizer

model_id = "CohereLabs/c4ai-command-r-plus"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# define conversation input:
conversation = [
    {"role": "user", "content": "Whats the biggest penguin in the world?"}
]

# define documents to ground on:
documents = [
    {"title": "Tall penguins", "text": "Emperor penguins are the tallest growing up to 122 cm in height."},
    {"title": "Penguin habitats", "text": "Emperor penguins only live in Antarctica."},
]

# render the grounded generation prompt as a string:
grounded_generation_prompt = tokenizer.apply_grounded_generation_template(
    conversation,
    documents=documents,
    citation_mode="accurate",  # or "fast"
    tokenize=False,
    add_generation_prompt=True,
)
print(grounded_generation_prompt)
### Single-Step Tool Use Capabilities ("Function Calling")
Command R+ has also been trained with tool-use capabilities: given a conversation and a set of available tools, the model can select the appropriate tool(s) and generate the parameters needed to invoke them.
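The Cohere tokenizer exposes a dedicated chat template for tool use. The sketch below renders a tool-use prompt; the `internet_search` tool and its parameter schema are illustrative examples, not part of the model itself:

```python
from transformers import AutoTokenizer

model_id = "CohereLabs/c4ai-command-r-plus"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# define conversation input:
conversation = [
    {"role": "user", "content": "Whats the biggest penguin in the world?"}
]

# Describe the available tools. The tool name and schema below are
# hypothetical, chosen for illustration.
tools = [
    {
        "name": "internet_search",
        "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet",
        "parameter_definitions": {
            "query": {
                "description": "Query to search the internet with",
                "type": "str",
                "required": True,
            }
        },
    }
]

# render the tool use prompt as a string:
tool_use_prompt = tokenizer.apply_tool_use_template(
    conversation,
    tools=tools,
    tokenize=False,
    add_generation_prompt=True,
)
print(tool_use_prompt)
```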
## 🔧 Technical Details
The model is an auto-regressive language model that uses an optimized transformer architecture. After pretraining, it is aligned with human preferences for helpfulness and safety using supervised fine-tuning (SFT) and preference training. The grounded generation behavior is trained via a mixture of supervised fine-tuning and preference fine-tuning with a specific prompt template.
## 📄 License
The model is released under a [CC-BY-NC](https://cohere.com/c4ai-cc-by-nc-license) license and also requires adherence to [Cohere Labs' Acceptable Use Policy](https://docs.cohere.com/docs/c4ai-acceptable-use-policy).
> â ī¸ **Important Note**
>
> This model is the non-quantized version of Cohere Labs Command R+. A quantized version of Cohere Labs Command R+ using bitsandbytes can be found [here](https://huggingface.co/CohereLabs/c4ai-command-r-plus-4bit).
> 💡 **Usage Tip**
>
> Deviating from the specific prompt template for grounded generation may reduce performance, but experimentation is encouraged.