# Cohere Labs Command R+ Model Card
Cohere Labs Command R+ is a 104B-parameter model with advanced capabilities. It supports Retrieval Augmented Generation (RAG) and multi-step tool use, excelling at tasks such as reasoning, summarization, and question answering. Evaluated in 10 languages, it is optimized for a range of real-world applications.
## 🚀 Quick Start
You can try out Cohere Labs Command R+ before downloading the weights in our hosted Hugging Face Space.
Please install `transformers` from the source repository, which includes the necessary changes for this model.
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "CohereLabs/c4ai-command-r-plus"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format the message with the chat template
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")

gen_tokens = model.generate(
    input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)

gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
```
## ✨ Features
- Advanced Capabilities: Cohere Labs Command R+ offers highly advanced capabilities, including Retrieval Augmented Generation (RAG) and multi-step tool use to automate sophisticated tasks.
- Multilingual Support: Evaluated in 10 languages (English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Arabic, and Simplified Chinese), it can handle diverse language needs.
- Optimized for Use Cases: Optimized for reasoning, summarization, and question answering, making it suitable for a variety of real-world applications.
- Grounded Generation: Specifically trained with grounded generation capabilities, it can generate responses based on supplied document snippets and include citations.
## 📦 Installation
Please install `transformers` from the source repository, which includes the necessary changes for this model:

```bash
pip install 'git+https://github.com/huggingface/transformers.git'
```
## 💻 Usage Examples
### Basic Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "CohereLabs/c4ai-command-r-plus"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format the message with the chat template
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")

gen_tokens = model.generate(
    input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)

gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
```
### Advanced Usage: Quantized model via bitsandbytes (8-bit precision)
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(load_in_8bit=True)

model_id = "CohereLabs/c4ai-command-r-plus"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)

# Format the message with the chat template
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")

gen_tokens = model.generate(
    input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)

gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
```
## 📚 Documentation
### Model Details
- Input: The model takes text as input.
- Output: The model generates text as output.
- Model Architecture: An auto-regressive language model that uses an optimized transformer architecture. After pretraining, the model is aligned with human preferences for helpfulness and safety using supervised fine-tuning (SFT) and preference training.
- Languages covered: Optimized for 10 languages (English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Arabic, and Simplified Chinese). Pre-training data additionally included 13 other languages (Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, and Persian).
- Context length: Command R+ supports a context length of 128K tokens.
### Evaluations
Command R+ has been submitted to the Open LLM Leaderboard, where its evaluation results are available.

Note that these metrics do not fully capture RAG, multilingual, or tool-use performance, nor the quality of open-ended generations. For further evaluations, refer to the model's dedicated evaluation resources for RAG, multilingual, and tool use, and to the Chatbot Arena for open-ended generation.
### Grounded Generation and RAG Capabilities
Command R+ has been specifically trained with grounded generation capabilities. It can generate responses based on supplied document snippets and include citations. The code example for rendering a grounded generation prompt is as follows:
Usage: Rendering grounded generation prompts

```python
from transformers import AutoTokenizer

model_id = "CohereLabs/c4ai-command-r-plus"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# define conversation input:
conversation = [
    {"role": "user", "content": "Whats the biggest penguin in the world?"}
]

# define documents to ground on:
documents = [
    {"title": "Tall penguins", "text": "Emperor penguins are the tallest growing up to 122 cm in height."},
    {"title": "Penguin habitats", "text": "Emperor penguins only live in Antarctica."},
]

# render the grounded generation prompt as a string:
grounded_generation_prompt = tokenizer.apply_grounded_generation_template(
    conversation,
    documents=documents,
    citation_mode="accurate",  # or "fast"
    tokenize=False,
    add_generation_prompt=True,
)
print(grounded_generation_prompt)
### Single-Step Tool Use Capabilities ("Function Calling")
Command R+ has also been trained with tool-use capabilities: given a conversation and a set of available tools, the model can select the appropriate tool(s) and generate the parameters needed to invoke them.
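The Cohere tokenizer exposes a dedicated chat template for tool use. The sketch below renders a tool-use prompt; the `internet_search` tool and its parameter schema are illustrative examples, not part of the model itself:

```python
from transformers import AutoTokenizer

model_id = "CohereLabs/c4ai-command-r-plus"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# define conversation input:
conversation = [
    {"role": "user", "content": "Whats the biggest penguin in the world?"}
]

# Describe the available tools. The tool name and schema below are
# hypothetical, chosen for illustration.
tools = [
    {
        "name": "internet_search",
        "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet",
        "parameter_definitions": {
            "query": {
                "description": "Query to search the internet with",
                "type": "str",
                "required": True,
            }
        },
    }
]

# render the tool use prompt as a string:
tool_use_prompt = tokenizer.apply_tool_use_template(
    conversation,
    tools=tools,
    tokenize=False,
    add_generation_prompt=True,
)
print(tool_use_prompt)
```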
## 🔧 Technical Details
The model is an auto-regressive language model that uses an optimized transformer architecture. After pretraining, it is aligned with human preferences for helpfulness and safety using supervised fine-tuning (SFT) and preference training. The grounded generation behavior is trained via a mixture of supervised fine-tuning and preference fine-tuning with a specific prompt template.
## 📄 License
The model is released under a [CC-BY-NC](https://cohere.com/c4ai-cc-by-nc-license) license and also requires adherence to [Cohere Labs' Acceptable Use Policy](https://docs.cohere.com/docs/c4ai-acceptable-use-policy).
> â ī¸ **Important Note**
>
> This model is the non-quantized version of Cohere Labs Command R+. A quantized version of Cohere Labs Command R+ using bitsandbytes can be found [here](https://huggingface.co/CohereLabs/c4ai-command-r-plus-4bit).
> 💡 **Usage Tip**
>
> Deviating from the specific prompt template for grounded generation may reduce performance, but experimentation is encouraged.