# Cohere Labs Command R+

Cohere Labs Command R+ is a 104B-parameter model with advanced capabilities, including Retrieval Augmented Generation (RAG) and tool use for automating complex tasks. Evaluated in 10 languages, it is optimized for reasoning, summarization, and question answering.
## Quick Start

### Try Cohere Labs Command R+

You can try out Cohere Labs Command R+ before downloading the weights in our hosted Hugging Face Space.
### Usage

Please install `transformers` from the source repository, which includes the necessary changes for this model.
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "CohereForAI/c4ai-command-r-plus-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format the message with the Command R+ chat template
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")

gen_tokens = model.generate(
    input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)

gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
```
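Note that `gen_tokens` includes the prompt tokens, so decoding `gen_tokens[0]` prints the rendered prompt followed by the reply. To print only the model's reply, slice the prompt off before decoding, e.g. `gen_tokens[0][input_ids.shape[-1]:]`. A minimal sketch of the slicing logic, using plain lists in place of real tensors:

```python
# Simulated token IDs: generation output begins with a copy of the prompt.
# With real tensors this is gen_tokens[0][input_ids.shape[-1]:].
prompt_ids = [5, 6, 7]         # stands in for input_ids[0]
generated = [5, 6, 7, 42, 43]  # stands in for gen_tokens[0]

# Keep only the newly generated continuation
new_token_ids = generated[len(prompt_ids):]
print(new_token_ids)  # → [42, 43]
```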
## Features

- Multilingual capability: Evaluated in 10 languages (English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Arabic, and Simplified Chinese), with pre-training data covering 13 additional languages.
- Tool use: Supports multi-step tool use, allowing the model to combine multiple tools over multiple steps to accomplish difficult tasks.
- Grounded generation and RAG: Can generate responses based on supplied document snippets and include grounding spans (citations) in its response.
## Installation

Please install `transformers` from the source repository, which includes the necessary changes for this model:

```shell
pip install 'git+https://github.com/huggingface/transformers.git' bitsandbytes accelerate
```
## Usage Examples
### Advanced Usage: Tool Use
```python
from transformers import AutoTokenizer

model_id = "CohereForAI/c4ai-command-r-plus-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Define the conversation history
conversation = [
    {"role": "user", "content": "What's the biggest penguin in the world?"}
]

# Define the tools available to the model
tools = [
    {
        "name": "internet_search",
        "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet",
        "parameter_definitions": {
            "query": {
                "description": "Query to search the internet with",
                "type": "str",
                "required": True
            }
        }
    },
    {
        "name": "directly_answer",
        "description": "Calls a standard (un-augmented) AI chatbot to generate a response given the conversation history",
        "parameter_definitions": {}
    }
]

# Render the tool-use prompt
tool_use_prompt = tokenizer.apply_tool_use_template(
    conversation,
    tools=tools,
    tokenize=False,
    add_generation_prompt=True,
)
print(tool_use_prompt)
```
### Advanced Usage: Grounded Generation
```python
from transformers import AutoTokenizer

model_id = "CohereForAI/c4ai-command-r-plus-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Define the conversation history
conversation = [
    {"role": "user", "content": "What's the biggest penguin in the world?"}
]

# Define the documents to ground the response on
documents = [
    {"title": "Tall penguins", "text": "Emperor penguins are the tallest growing up to 122 cm in height."},
    {"title": "Penguin habitats", "text": "Emperor penguins only live in Antarctica."}
]

# Render the grounded generation prompt
grounded_generation_prompt = tokenizer.apply_grounded_generation_template(
    conversation,
    documents=documents,
    citation_mode="accurate",
    tokenize=False,
    add_generation_prompt=True,
)
print(grounded_generation_prompt)
```
## Documentation

### Model Details

| Property | Details |
|----------|---------|
| Model Type | Auto-regressive language model using an optimized transformer architecture |
| Training Data | Pre-training data includes 23 languages (10 for evaluation and 13 additional) |
| Model Size | 104 billion parameters |
| Context Length | 128K |
- Input: The model takes text as input.
- Output: The model generates text as output.
- Model Architecture: An auto-regressive language model with an optimized transformer architecture. After pretraining, it uses supervised fine-tuning (SFT) and preference training to align model behavior with human preferences.
### Tool Use & Multihop Capabilities

Command R+ has been specifically trained with conversational tool-use capabilities. It takes a conversation and a list of available tools as input and generates a JSON-formatted list of actions. Comprehensive documentation for working with Command R+'s tool-use prompt template can be found here. It also supports Hugging Face's tool-use API.
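Because the action list is JSON, it can be consumed with standard JSON tooling. The sketch below parses a hypothetical reply (the string is an illustrative assumption, not actual model output, and the model may wrap the JSON in additional text you would need to strip first):

```python
import json

# Hypothetical tool-use reply: a JSON-formatted list of actions.
# This string is an illustrative assumption, not real Command R+ output.
model_reply = (
    '[{"tool_name": "internet_search",'
    ' "parameters": {"query": "biggest penguin in the world"}}]'
)

# Parse the action plan and dispatch each requested tool call
actions = json.loads(model_reply)
for action in actions:
    print(f'calling {action["tool_name"]} with {action["parameters"]}')
```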
### Grounded Generation and RAG Capabilities

Command R+ can generate responses based on supplied document snippets and include grounding spans (citations) in its response. This behavior has been trained using a specific prompt template. Comprehensive documentation for working with Command R+'s grounded generation prompt template can be found here.
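As a sketch of how grounding spans might be post-processed, the snippet below extracts `<co: N>...</co: N>`-style citation markers from a hypothetical grounded answer. Both the tag syntax and the example text are assumptions for illustration; consult the grounded generation documentation for the exact output format:

```python
import re

# Hypothetical grounded answer with citation spans; the <co: N> tag syntax
# and the answer text are assumptions for illustration only.
grounded_answer = (
    "The biggest penguin in the world is the <co: 0>emperor penguin</co: 0>, "
    "which grows <co: 0>up to 122 cm in height</co: 0> and "
    "<co: 1>lives only in Antarctica</co: 1>."
)

# Each match pairs a cited document index with the span of text it supports
citations = re.findall(r"<co: (\d+)>(.*?)</co: \1>", grounded_answer)
for doc_index, span in citations:
    print(f"doc {doc_index}: {span}")

# Strip the tags to recover plain text for display
plain_text = re.sub(r"</?co: \d+>", "", grounded_answer)
print(plain_text)
```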
## Technical Details

- Tool Use Training: Command R+ is trained for tool use via a mixture of supervised fine-tuning and preference fine-tuning, using a specific prompt template.
- Grounded Generation Training: Grounded generation capabilities are likewise trained through a combination of supervised fine-tuning and preference fine-tuning with a specific prompt template.
## License

## Important Note

This model is a 4-bit quantized version of Cohere Labs Command R+ using bitsandbytes. You can find the unquantized version of Cohere Labs Command R+ here.

## Usage Tip

When using the tool-use and grounded generation capabilities, following the specific prompt templates is recommended for best performance, but experimentation is also encouraged.