C4AI Command R+ FP8
This repository provides the c4ai-command-r-plus model quantized to FP8 by FriendliAI. The quantization significantly boosts inference efficiency while maintaining high accuracy.
Features
- Quantization: The model is quantized to FP8 by FriendliAI, enhancing inference efficiency.
- Compatibility: Compatible with Friendli Container.
- Multilingual Support: Supports multiple languages including English, French, German, Spanish, Italian, Portuguese, Japanese, Korean, Chinese, and Arabic.
Installation
- Sign up: Sign up for Friendli Suite. You can use Friendli Containers free of charge for four weeks.
- Prepare PAT: Prepare a Personal Access Token following this guide.
- Prepare Secret: Prepare a Friendli Container Secret following this guide.
- Install the Hugging Face CLI:
  ```shell
  pip install -U "huggingface_hub[cli]"
  ```
Preparing Personal Access Token
- Sign in to Friendli Suite.
- Go to User Settings > Tokens and click 'Create new token'.
- Save your created token value.
Pulling Friendli Container Image
- Log in to the Docker client using the personal access token:
  ```shell
  export FRIENDLI_PAT="YOUR PAT"
  docker login registry.friendli.ai -u $YOUR_EMAIL -p $FRIENDLI_PAT
  ```
- Pull the image:
  ```shell
  docker pull registry.friendli.ai/trial
  ```
Usage Examples
Running Friendli Container
```shell
docker run \
  --gpus '"device=0,1,2,3"' \
  -p 8000:8000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  -e FRIENDLI_CONTAINER_SECRET="YOUR CONTAINER SECRET" \
  registry.friendli.ai/trial \
  --web-server-port 8000 \
  --hf-model-name FriendliAI/c4ai-command-r-plus-fp8 \
  --num-devices 4
```
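Once the container is up, it can be queried over HTTP. Below is a minimal sketch, assuming the container exposes an OpenAI-compatible `/v1/chat/completions` endpoint on the mapped port; the `chat` helper name is hypothetical, introduced here for illustration:

```python
import json
from urllib.request import Request, urlopen

# Request body in the OpenAI chat-completions format, targeting the FP8 model
payload = {
    "model": "FriendliAI/c4ai-command-r-plus-fp8",
    "messages": [{"role": "user", "content": "Hello, how are you?"}],
    "max_tokens": 100,
    "temperature": 0.3,
}

def chat(payload: dict, url: str = "http://localhost:8000/v1/chat/completions") -> dict:
    # POST the JSON payload to the container and decode the JSON response
    req = Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(req) as resp:
        return json.loads(resp.read())
```

If the endpoint follows the OpenAI response format, the reply text of `chat(payload)` sits at `choices[0]["message"]["content"]`.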
Documentation
Original model card: CohereForAI's C4AI Command R+
C4AI Command R+ is an open-weights research release of a 104-billion-parameter model with advanced capabilities such as Retrieval-Augmented Generation (RAG) and tool use. It is a multilingual model optimized for a variety of use cases.
Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "CohereForAI/c4ai-command-r-plus"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format the message with the Command R+ chat template
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")

gen_tokens = model.generate(
    input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)

gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
```
Quantized model through bitsandbytes, 8-bit precision
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

# Load the weights in 8-bit precision via bitsandbytes
bnb_config = BitsAndBytesConfig(load_in_8bit=True)

model_id = "CohereForAI/c4ai-command-r-plus"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)

# Format the message with the Command R+ chat template
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")

gen_tokens = model.generate(
    input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)

gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
Quantized model through bitsandbytes, 4-bit precision
This model is the non-quantized version of C4AI Command R+. You can find the quantized version of C4AI Command R+ using bitsandbytes here.
Model Details
- Input: The model takes text as input.
- Output: The model generates text.
- Model Architecture: It's an auto-regressive language model using an optimized transformer architecture. After pretraining, it uses supervised fine-tuning (SFT) and preference training.
- Languages covered: Optimized for English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Simplified Chinese, and Arabic. Pre-training data also included 13 other languages.
- Context length: Supports a context length of 128K.
Evaluations
Tool use & multihop capabilities
Command R+ has conversational tool use capabilities trained via supervised and preference fine-tuning. It takes a conversation and a list of tools as input and generates a JSON-formatted list of actions. It can recognize a directly_answer tool.
```python
from transformers import AutoTokenizer

model_id = "CohereForAI/c4ai-command-r-plus"
tokenizer = AutoTokenizer.from_pretrained(model_id)

conversation = [
    {"role": "user", "content": "Whats the biggest penguin in the world?"}
]

# Tool definitions passed to the tool-use chat template
tools = [
    {
        "name": "internet_search",
        "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet",
        "parameter_definitions": {
            "query": {
                "description": "Query to search the internet with",
                "type": "str",
                "required": True
            }
        }
    },
    {
        "name": "directly_answer",
        "description": "Calls a standard (un-augmented) AI chatbot to generate a response given the conversation history",
        "parameter_definitions": {}
    }
]

# Render the prompt with the tool-use chat template
formatted_input = tokenizer.apply_tool_use_template(conversation, tools=tools, tokenize=False, add_generation_prompt=True)
print(formatted_input)
```
Technical Details
- Model creator: CohereForAI
- Original model: c4ai-command-r-plus
- Quantized by: FriendliAI
- Model type: Text generation
- Library name: transformers
- Supported languages: English, French, German, Spanish, Italian, Portuguese, Japanese, Korean, Chinese, Arabic
License
This model is under the CC-BY-NC license and requires adhering to C4AI's Acceptable Use Policy.