# Cohere Labs Command R+

Cohere Labs Command R+ is a 104B-parameter model with advanced capabilities, including Retrieval Augmented Generation (RAG) and tool use for automating complex tasks. Evaluated in 10 languages, it is optimized for reasoning, summarization, and question answering.
## Quick Start

### Try Cohere Labs Command R+

You can try out Cohere Labs Command R+ before downloading the weights in our hosted Hugging Face Space.
### Usage

Please install `transformers` from the source repository, which includes the necessary changes for this model.
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "CohereForAI/c4ai-command-r-plus-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format the message with the Command R+ chat template
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")

gen_tokens = model.generate(
    input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)

gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
```
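Note that `gen_tokens` includes the prompt tokens, so decoding `gen_tokens[0]` prints the rendered prompt followed by the reply. To print only the model's reply, slice the prompt off before decoding, e.g. `gen_tokens[0][input_ids.shape[-1]:]`. A minimal sketch of the slicing logic, using plain lists in place of real tensors:

```python
# Simulated token IDs: generation output begins with a copy of the prompt.
# With real tensors this is gen_tokens[0][input_ids.shape[-1]:].
prompt_ids = [5, 6, 7]         # stands in for input_ids[0]
generated = [5, 6, 7, 42, 43]  # stands in for gen_tokens[0]

# Keep only the newly generated continuation
new_token_ids = generated[len(prompt_ids):]
print(new_token_ids)  # → [42, 43]
```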
## Features

- Multilingual capability: Evaluated in 10 languages (English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Arabic, and Simplified Chinese), with pre-training data covering 13 additional languages.
- Tool use: Supports multi-step tool use, allowing the model to combine multiple tools over multiple steps to accomplish difficult tasks.
- Grounded generation and RAG: Can generate responses based on supplied document snippets and include grounding spans (citations) in its response.
## Installation

Please install `transformers` from the source repository, which includes the necessary changes for this model:

```shell
pip install 'git+https://github.com/huggingface/transformers.git' bitsandbytes accelerate
```
## Usage Examples
### Advanced Usage: Tool Use
```python
from transformers import AutoTokenizer

model_id = "CohereForAI/c4ai-command-r-plus-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Define the conversation history
conversation = [
    {"role": "user", "content": "What's the biggest penguin in the world?"}
]

# Define the tools available to the model
tools = [
    {
        "name": "internet_search",
        "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet",
        "parameter_definitions": {
            "query": {
                "description": "Query to search the internet with",
                "type": "str",
                "required": True
            }
        }
    },
    {
        "name": "directly_answer",
        "description": "Calls a standard (un-augmented) AI chatbot to generate a response given the conversation history",
        "parameter_definitions": {}
    }
]

# Render the tool-use prompt
tool_use_prompt = tokenizer.apply_tool_use_template(
    conversation,
    tools=tools,
    tokenize=False,
    add_generation_prompt=True,
)
print(tool_use_prompt)
```
### Advanced Usage: Grounded Generation
```python
from transformers import AutoTokenizer

model_id = "CohereForAI/c4ai-command-r-plus-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Define the conversation history
conversation = [
    {"role": "user", "content": "What's the biggest penguin in the world?"}
]

# Define the documents to ground the response on
documents = [
    {"title": "Tall penguins", "text": "Emperor penguins are the tallest growing up to 122 cm in height."},
    {"title": "Penguin habitats", "text": "Emperor penguins only live in Antarctica."}
]

# Render the grounded generation prompt
grounded_generation_prompt = tokenizer.apply_grounded_generation_template(
    conversation,
    documents=documents,
    citation_mode="accurate",
    tokenize=False,
    add_generation_prompt=True,
)
print(grounded_generation_prompt)
```
## Documentation

### Model Details

| Property | Details |
|----------|---------|
| Model Type | Auto-regressive language model using an optimized transformer architecture |
| Training Data | Pre-training data includes 23 languages (10 for evaluation and 13 additional) |
| Model Size | 104 billion parameters |
| Context Length | 128K |
- Input: The model takes text as input.
- Output: The model generates text as output.
- Model Architecture: An auto-regressive language model with an optimized transformer architecture. After pretraining, it uses supervised fine-tuning (SFT) and preference training to align model behavior with human preferences.
### Tool Use & Multihop Capabilities

Command R+ has been specifically trained with conversational tool-use capabilities. It takes a conversation and a list of available tools as input and generates a JSON-formatted list of actions. Comprehensive documentation for working with Command R+'s tool-use prompt template can be found here. It also supports Hugging Face's tool-use API.
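Because the action list is JSON, it can be consumed with standard JSON tooling. The sketch below parses a hypothetical reply (the string is an illustrative assumption, not actual model output, and the model may wrap the JSON in additional text you would need to strip first):

```python
import json

# Hypothetical tool-use reply: a JSON-formatted list of actions.
# This string is an illustrative assumption, not real Command R+ output.
model_reply = (
    '[{"tool_name": "internet_search",'
    ' "parameters": {"query": "biggest penguin in the world"}}]'
)

# Parse the action plan and dispatch each requested tool call
actions = json.loads(model_reply)
for action in actions:
    print(f'calling {action["tool_name"]} with {action["parameters"]}')
```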
### Grounded Generation and RAG Capabilities

Command R+ can generate responses based on supplied document snippets and include grounding spans (citations) in its response. This behavior has been trained using a specific prompt template. Comprehensive documentation for working with Command R+'s grounded generation prompt template can be found here.
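As a sketch of how grounding spans might be post-processed, the snippet below extracts `<co: N>...</co: N>`-style citation markers from a hypothetical grounded answer. Both the tag syntax and the example text are assumptions for illustration; consult the grounded generation documentation for the exact output format:

```python
import re

# Hypothetical grounded answer with citation spans; the <co: N> tag syntax
# and the answer text are assumptions for illustration only.
grounded_answer = (
    "The biggest penguin in the world is the <co: 0>emperor penguin</co: 0>, "
    "which grows <co: 0>up to 122 cm in height</co: 0> and "
    "<co: 1>lives only in Antarctica</co: 1>."
)

# Each match pairs a cited document index with the span of text it supports
citations = re.findall(r"<co: (\d+)>(.*?)</co: \1>", grounded_answer)
for doc_index, span in citations:
    print(f"doc {doc_index}: {span}")

# Strip the tags to recover plain text for display
plain_text = re.sub(r"</?co: \d+>", "", grounded_answer)
print(plain_text)
```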
## Technical Details

- Tool Use Training: Command R+ is trained for tool use via a mixture of supervised fine-tuning and preference fine-tuning, using a specific prompt template.
- Grounded Generation Training: Grounded generation capabilities are likewise trained through a combination of supervised fine-tuning and preference fine-tuning with a specific prompt template.
## License

## Important Note

This model is a 4-bit quantized version of Cohere Labs Command R+ using bitsandbytes. You can find the unquantized version of Cohere Labs Command R+ here.

## Usage Tip

When using the tool-use and grounded generation capabilities, following the specific prompt templates is recommended for best performance, but experimentation is also encouraged.