🚀 Cohere Labs Command R+ 08-2024
Cohere Labs Command R+ 08-2024 is an open-weights research release of a 104B-parameter model with advanced capabilities, including Retrieval Augmented Generation (RAG) and tool use, optimized for a variety of use cases.
🚀 Quick Start
Model Summary
Cohere Labs Command R+ 08-2024 is an open-weights research release of a 104-billion-parameter model. It comes with highly advanced capabilities, including Retrieval Augmented Generation (RAG) and tool use for automating sophisticated tasks. The model's tool use is multi-step: it can combine multiple tools over multiple steps to accomplish difficult tasks. It is a multilingual model trained on 23 languages and evaluated in 10 languages, and is optimized for use cases such as reasoning, summarization, and question answering.
This model is part of a family of open-weight releases from Cohere Labs and Cohere. Its smaller companion model is Cohere Labs Command R 08-2024.
Try Cohere Labs Command R+
You can try out Cohere Labs Command R+ in our hosted Hugging Face Space before downloading the weights.
✨ Features
Grounded Generation and RAG Capabilities
Command R+ 08-2024 has been specifically trained with grounded generation capabilities. It can generate responses based on a list of supplied document snippets and will include grounding spans (citations) in its response to indicate the source of the information. This can be used for grounded summarization and the final step of Retrieval Augmented Generation (RAG). This behavior is trained via a mixture of supervised fine-tuning and preference fine-tuning using a specific prompt template. Deviating from this prompt template may reduce performance, but experimentation is encouraged.
The model's grounded generation behavior takes a conversation (with an optional user-supplied system preamble indicating task, context, and desired output style) and a list of retrieved document snippets as input. The document snippets should be chunks, usually around 100-400 words per chunk, and consist of key-value pairs.
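One simple way to produce snippets of the recommended size is to split each document into fixed-size word chunks before passing them in. The sketch below is illustrative only: the `chunk_text` helper, the 300-word target, and the snippet titles are assumptions, not part of the model's API.

```python
def chunk_text(text, target_words=300):
    """Split a document into chunks of at most target_words words."""
    words = text.split()
    return [" ".join(words[i:i + target_words])
            for i in range(0, len(words), target_words)]

# A stand-in 750-word document
article = "word " * 750
# Each chunk becomes a key-value snippet, as the model expects
snippets = [{"title": f"Article, part {i + 1}", "text": chunk}
            for i, chunk in enumerate(chunk_text(article))]
print(len(snippets))  # 3 chunks, each within the 100-400 word range
```

In practice you would chunk on semantic boundaries (paragraphs or sections) rather than raw word counts, but the key-value shape of each snippet stays the same.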
By default, it will generate grounded responses by first predicting relevant documents, then predicting citations, generating an answer, and finally inserting grounding spans. This is called `accurate` grounded generation. The model also supports a `fast` citation mode in the tokenizer, which directly generates an answer with grounding spans, sacrificing some grounding accuracy for fewer generated tokens.
Comprehensive documentation for working with Command R+ 08-2024's grounded generation prompt template can be found here, here, and here.
📦 Installation
Please use `transformers` version 4.39.1 or higher.

```shell
pip install 'transformers>=4.39.1'
```
💻 Usage Examples
Basic Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "CohereLabs/c4ai-command-r-plus-08-2024"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format the message with the chat template
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
)

gen_tokens = model.generate(
    input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)

gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
```
Advanced Usage
Usage: Rendering Grounded Generation prompts
```python
from transformers import AutoTokenizer

model_id = "CohereLabs/c4ai-command-r-plus-08-2024"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Define a conversation
conversation = [
    {"role": "user", "content": "Whats the biggest penguin in the world?"}
]

# Define the document snippets for grounded generation
documents = [
    {"title": "Tall penguins", "text": "Emperor penguins are the tallest growing up to 122 cm in height."},
    {"title": "Penguin habitats", "text": "Emperor penguins only live in Antarctica."},
]

# Render the grounded-generation prompt as a string
grounded_generation_prompt = tokenizer.apply_grounded_generation_template(
    conversation,
    documents=documents,
    citation_mode="accurate",  # or "fast"
    tokenize=False,
    add_generation_prompt=True,
)
print(grounded_generation_prompt)
```
📚 Documentation
Model Details
Input: The model only takes text as input.
Output: The model only generates text.
Model Architecture: It is an auto-regressive language model using an optimized transformer architecture. After pretraining, it uses supervised fine-tuning (SFT) and preference training to align model behavior with human preferences for helpfulness and safety. Grouped query attention (GQA) is used to improve inference speed.
Languages covered: The model has been trained on 23 languages (English, French, Spanish, Italian, German, Portuguese, Japanese, Korean, Arabic, Simplified Chinese, Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, and Persian) and evaluated on 10 languages (English, French, Spanish, Italian, German, Portuguese, Japanese, Korean, Arabic, Simplified Chinese).
Context length: Command R+ 08-2024 supports a context length of 128K tokens.
🔧 Technical Details
The model's advanced capabilities, such as grounded generation and multi-step tool use, are achieved through a combination of supervised fine-tuning and preference training. The use of grouped query attention (GQA) in the architecture significantly improves inference speed. The specific prompt template for grounded generation is carefully designed to train the model to generate accurate and useful responses with proper citations.
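The idea behind GQA is that several query heads share a single key/value head, shrinking the KV cache and speeding up inference. The following is a toy NumPy sketch of that head-sharing pattern, not the model's actual implementation; all shapes and head counts are made up for illustration.

```python
import numpy as np

def gqa_attention(q, k, v):
    """Grouped-query attention: groups of query heads share one K/V head.

    q: (n_q_heads, seq, d)    k, v: (n_kv_heads, seq, d)
    """
    n_q_heads, seq, d = q.shape
    n_kv_heads = k.shape[0]
    group = n_q_heads // n_kv_heads  # query heads per shared K/V head
    out = np.empty_like(q)
    for h in range(n_q_heads):
        kv = h // group  # which shared K/V head this query head uses
        scores = q[h] @ k[kv].T / np.sqrt(d)
        # Numerically stable softmax over the key dimension
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        out[h] = weights @ v[kv]
    return out

# 8 query heads sharing 2 K/V heads: the K/V cache is 4x smaller
rng = np.random.default_rng(0)
q = rng.normal(size=(8, 5, 16))
k = rng.normal(size=(2, 5, 16))
v = rng.normal(size=(2, 5, 16))
out = gqa_attention(q, k, v)
print(out.shape)  # (8, 5, 16)
```

With `n_kv_heads` equal to `n_q_heads` this reduces to standard multi-head attention; with `n_kv_heads = 1` it becomes multi-query attention. GQA sits between the two.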
📄 License
The model is licensed under CC-BY-NC and requires adhering to Cohere Labs' Acceptable Use Policy.
Additional Information
- License: cc-by-nc-4.0
- Inference: false
- Library Name: transformers
- Supported Languages: en, fr, de, es, it, pt, ja, ko, zh, ar
- Extra Gated Prompt: "By submitting this form, you agree to the License Agreement and acknowledge that the information you provide will be collected, used, and shared in accordance with Cohere’s Privacy Policy. You’ll receive email updates about Cohere Labs and Cohere research, events, products and services. You can unsubscribe at any time."
- Extra Gated Fields:
- Name: text
- Affiliation: text
- Country:
- Type: select
- Options: A long list of countries (as provided in the original document)
- Usage Agreement: I agree to use this model for non-commercial use ONLY: checkbox