🚀 Cohere Labs Command R+ 08-2024
Cohere Labs Command R+ 08-2024 is an open-weights research release of a 104B-parameter model with advanced capabilities, including Retrieval Augmented Generation (RAG) and tool use, optimized for a variety of use cases.
🚀 Quick Start
Model Summary
Cohere Labs Command R+ 08-2024 is an open-weights research release of a 104-billion-parameter model. It comes with highly advanced capabilities, including Retrieval Augmented Generation (RAG) and tool use for automating sophisticated tasks. The model's tool use is multi-step: it can combine multiple tools over multiple steps to accomplish difficult tasks. It is a multilingual model trained on 23 languages and evaluated in 10 languages, and is optimized for use cases such as reasoning, summarization, and question answering.
This model is part of a family of open-weight releases from Cohere Labs and Cohere. Its smaller companion model is Cohere Labs Command R 08-2024.
Try Cohere Labs Command R+
You can try out Cohere Labs Command R+ in our hosted Hugging Face Space before downloading the weights.
✨ Features
Grounded Generation and RAG Capabilities
Command R+ 08-2024 has been specifically trained with grounded generation capabilities. It can generate responses based on a list of supplied document snippets and will include grounding spans (citations) in its response to indicate the source of the information. This can be used for grounded summarization and the final step of Retrieval Augmented Generation (RAG). This behavior is trained via a mixture of supervised fine-tuning and preference fine-tuning using a specific prompt template. Deviating from this prompt template may reduce performance, but experimentation is encouraged.
The model's grounded generation behavior takes a conversation (with an optional user-supplied system preamble indicating task, context, and desired output style) and a list of retrieved document snippets as input. The document snippets should be chunks, usually around 100-400 words per chunk, and consist of key-value pairs.
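One simple way to produce snippets of the recommended size is to split each document into fixed-size word chunks before passing them in. The sketch below is illustrative only: the `chunk_text` helper, the 300-word target, and the snippet titles are assumptions, not part of the model's API.

```python
def chunk_text(text, target_words=300):
    """Split a document into chunks of at most target_words words."""
    words = text.split()
    return [" ".join(words[i:i + target_words])
            for i in range(0, len(words), target_words)]

# A stand-in 750-word document
article = "word " * 750
# Each chunk becomes a key-value snippet, as the model expects
snippets = [{"title": f"Article, part {i + 1}", "text": chunk}
            for i, chunk in enumerate(chunk_text(article))]
print(len(snippets))  # 3 chunks, each within the 100-400 word range
```

In practice you would chunk on semantic boundaries (paragraphs or sections) rather than raw word counts, but the key-value shape of each snippet stays the same.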
By default, it will generate grounded responses by first predicting relevant documents, then predicting citations, generating an answer, and finally inserting grounding spans. This is called `accurate` grounded generation. The model also supports a `fast` citation mode in the tokenizer, which directly generates an answer with grounding spans, sacrificing some grounding accuracy for fewer generated tokens.
Comprehensive documentation for working with Command R+ 08-2024's grounded generation prompt template can be found here, here, and here.
📦 Installation
Please use `transformers` version 4.39.1 or higher.

```shell
pip install 'transformers>=4.39.1'
```
💻 Usage Examples
Basic Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "CohereLabs/c4ai-command-r-plus-08-2024"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format the message with the chat template
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
)

gen_tokens = model.generate(
    input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)

gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
```
Advanced Usage
Usage: Rendering Grounded Generation prompts
```python
from transformers import AutoTokenizer

model_id = "CohereLabs/c4ai-command-r-plus-08-2024"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Define a conversation
conversation = [
    {"role": "user", "content": "Whats the biggest penguin in the world?"}
]

# Define the document snippets for grounded generation
documents = [
    {"title": "Tall penguins", "text": "Emperor penguins are the tallest growing up to 122 cm in height."},
    {"title": "Penguin habitats", "text": "Emperor penguins only live in Antarctica."},
]

# Render the grounded-generation prompt as a string
grounded_generation_prompt = tokenizer.apply_grounded_generation_template(
    conversation,
    documents=documents,
    citation_mode="accurate",  # or "fast"
    tokenize=False,
    add_generation_prompt=True,
)
print(grounded_generation_prompt)
```
📚 Documentation
Model Details
Input: The model only takes text as input.
Output: The model only generates text.
Model Architecture: It is an auto-regressive language model using an optimized transformer architecture. After pretraining, it uses supervised fine-tuning (SFT) and preference training to align model behavior with human preferences for helpfulness and safety. Grouped query attention (GQA) is used to improve inference speed.
Languages covered: The model has been trained on 23 languages (English, French, Spanish, Italian, German, Portuguese, Japanese, Korean, Arabic, Simplified Chinese, Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, and Persian) and evaluated on 10 languages (English, French, Spanish, Italian, German, Portuguese, Japanese, Korean, Arabic, Simplified Chinese).
Context length: Command R+ 08-2024 supports a context length of 128K tokens.
🔧 Technical Details
The model's advanced capabilities, such as grounded generation and multi-step tool use, are achieved through a combination of supervised fine-tuning and preference training. The use of grouped query attention (GQA) in the architecture significantly improves inference speed. The specific prompt template for grounded generation is carefully designed to train the model to generate accurate and useful responses with proper citations.
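The idea behind GQA is that several query heads share a single key/value head, shrinking the KV cache and speeding up inference. The following is a toy NumPy sketch of that head-sharing pattern, not the model's actual implementation; all shapes and head counts are made up for illustration.

```python
import numpy as np

def gqa_attention(q, k, v):
    """Grouped-query attention: groups of query heads share one K/V head.

    q: (n_q_heads, seq, d)    k, v: (n_kv_heads, seq, d)
    """
    n_q_heads, seq, d = q.shape
    n_kv_heads = k.shape[0]
    group = n_q_heads // n_kv_heads  # query heads per shared K/V head
    out = np.empty_like(q)
    for h in range(n_q_heads):
        kv = h // group  # which shared K/V head this query head uses
        scores = q[h] @ k[kv].T / np.sqrt(d)
        # Numerically stable softmax over the key dimension
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        out[h] = weights @ v[kv]
    return out

# 8 query heads sharing 2 K/V heads: the K/V cache is 4x smaller
rng = np.random.default_rng(0)
q = rng.normal(size=(8, 5, 16))
k = rng.normal(size=(2, 5, 16))
v = rng.normal(size=(2, 5, 16))
out = gqa_attention(q, k, v)
print(out.shape)  # (8, 5, 16)
```

With `n_kv_heads` equal to `n_q_heads` this reduces to standard multi-head attention; with `n_kv_heads = 1` it becomes multi-query attention. GQA sits between the two.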
📄 License
The model is licensed under CC-BY-NC and requires adhering to Cohere Labs' Acceptable Use Policy.
Additional Information
- License: cc-by-nc-4.0
- Inference: false
- Library Name: transformers
- Supported Languages: en, fr, de, es, it, pt, ja, ko, zh, ar
- Extra Gated Prompt: "By submitting this form, you agree to the License Agreement and acknowledge that the information you provide will be collected, used, and shared in accordance with Cohere’s Privacy Policy. You’ll receive email updates about Cohere Labs and Cohere research, events, products and services. You can unsubscribe at any time."
- Extra Gated Fields:
- Name: text
- Affiliation: text
- Country:
- Type: select
- Options: A long list of countries (as provided in the original document)
- Usage Agreement: I agree to use this model for non-commercial use ONLY: checkbox