🚀 C4AI Command R 08-2024
Cohere Labs Command R 08-2024 is a high-performance generative model with 32 billion parameters. It is optimized for a range of use cases, including reasoning, summarization, and question answering, and offers multilingual generation capabilities.
🚀 Quick Start
Usage
Please use `transformers` version 4.39.1 or higher.
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "CohereLabs/c4ai-command-r-08-2024"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format the message with the chat template
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")

gen_tokens = model.generate(
    input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)

gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
```
Try Cohere Labs Command R
If you want to try Command R before downloading the weights, the model is hosted in a Hugging Face Space here.
✨ Features
- Multilingual Generation: Trained on 23 languages and evaluated in 10, enabling it to handle a wide range of language-related tasks.
- High-Performance RAG: Delivers highly performant retrieval-augmented generation (RAG).
- Optimized for Multiple Use Cases: Well suited to reasoning, summarization, and question answering.
📦 Installation
Ensure you have `transformers` version 4.39.1 or higher installed:

```bash
pip install 'transformers>=4.39.1'
```
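As a runtime sanity check, you can compare the installed version against the minimum. The sketch below uses only the standard library and a naive numeric comparison (for robust version handling, `packaging.version` is the usual choice); the helper names are illustrative, not part of any official API:

```python
from importlib.metadata import version, PackageNotFoundError

MIN_VERSION = (4, 39, 1)

def parse_version(text):
    """Parse '4.39.1' (or '4.40.0.dev0') into a numeric tuple like (4, 39, 1)."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = "".join(ch for ch in piece if ch.isdigit())
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)

def transformers_is_recent_enough(minimum=MIN_VERSION):
    """Return True if an installed transformers meets the minimum version."""
    try:
        return parse_version(version("transformers")) >= minimum
    except PackageNotFoundError:
        return False
```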
💻 Usage Examples
Advanced Usage - Grounded Generation
You can render the grounded generation prompt template with `apply_grounded_generation_template()`. The following code shows a minimal working example:
```python
from transformers import AutoTokenizer

model_id = "CohereLabs/c4ai-command-r-08-2024"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Define the conversation input
conversation = [
    {"role": "user", "content": "What's the biggest penguin in the world?"}
]

# Define documents for retrieval-augmented generation
documents = [
    {"title": "Tall penguins", "text": "Emperor penguins are the tallest growing up to 122 cm in height."},
    {"title": "Penguin habitats", "text": "Emperor penguins only live in Antarctica."},
]

# Render the grounded generation prompt
grounded_generation_prompt = tokenizer.apply_grounded_generation_template(
    conversation,
    documents=documents,
    citation_mode="accurate",
    tokenize=False,
    add_generation_prompt=True,
)
print(grounded_generation_prompt)
```
📚 Documentation
Model Details
- Input: The model only takes text as input.
- Output: The model only generates text as output.
- Model Architecture: An auto-regressive language model that uses an optimized transformer architecture. After pretraining, supervised fine-tuning (SFT) and preference training align the model with human preferences for helpfulness and safety. Grouped-query attention (GQA) is used to improve inference speed.
- Languages covered: Trained on 23 languages (English, French, Spanish, Italian, German, Portuguese, Japanese, Korean, Arabic, Simplified Chinese, Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, and Persian) and evaluated on 10 languages (English, French, Spanish, Italian, German, Portuguese, Japanese, Korean, Arabic, Simplified Chinese).
- Context length: Supports a context length of 128K tokens.
Grounded Generation and RAG Capabilities
Command R 08-2024 has been trained specifically for grounded generation: it can generate responses based on supplied document snippets and include grounding spans (citations) in its response.
Comprehensive documentation for working with Command R 08-2024's grounded generation prompt template can be found here, here and here.
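In "accurate" citation mode, the grounded answer interleaves citation markers with the generated text. As an illustration only (the `<co: N>…</co: N>` markup below is an assumption; consult the linked documentation for the authoritative output format), such spans could be post-processed with a small parser:

```python
import re

# Hypothetical grounded answer; the <co: N>...</co: N> markers are an assumed
# citation format, not confirmed by this card.
answer = ("The biggest penguin is the <co: 0>emperor penguin</co: 0>, "
          "which <co: 1>lives in Antarctica</co: 1>.")

def extract_citations(text):
    """Return (plain_text, citations), where each citation records the cited
    document id and the character span it covers in the plain text."""
    citations = []
    plain_parts = []
    cursor = 0
    pattern = re.compile(r"<co: (\d+)>(.*?)</co: \1>")
    for match in pattern.finditer(text):
        plain_parts.append(text[cursor:match.start()])
        start = sum(len(p) for p in plain_parts)
        span = match.group(2)
        plain_parts.append(span)
        citations.append({"doc_id": int(match.group(1)),
                          "start": start, "end": start + len(span)})
        cursor = match.end()
    plain_parts.append(text[cursor:])
    return "".join(plain_parts), citations

plain, cites = extract_citations(answer)
```

Here `plain` is the answer with markers stripped, and `cites` maps each span back to the document index it was grounded in.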
🔧 Technical Details
The model is an auto-regressive language model. After pretraining, supervised fine-tuning (SFT) and preference training are used to align the model's behavior with human preferences for helpfulness and safety. Grouped-query attention (GQA) is applied to enhance inference speed.
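Grouped-query attention shares each key/value head across a group of query heads, shrinking the KV cache and speeding up decoding. A minimal NumPy sketch of the head-sharing idea, for a single query token (illustrative shapes only, not the model's actual configuration):

```python
import numpy as np

def grouped_query_attention(q, k, v, n_kv_heads):
    """Single-token GQA sketch.

    q: (n_q_heads, d); k, v: (n_kv_heads, seq, d). Each group of
    n_q_heads // n_kv_heads query heads attends against one shared KV head.
    """
    n_q_heads, d = q.shape
    group = n_q_heads // n_kv_heads
    # Repeat each KV head so every query head has a matching key/value set.
    k_full = np.repeat(k, group, axis=0)  # (n_q_heads, seq, d)
    v_full = np.repeat(v, group, axis=0)
    scores = np.einsum("hd,hsd->hs", q, k_full) / np.sqrt(d)
    # Numerically stable softmax over the sequence axis.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return np.einsum("hs,hsd->hd", weights, v_full)

out = grouped_query_attention(
    q=np.random.randn(8, 16),      # 8 query heads
    k=np.random.randn(2, 10, 16),  # only 2 KV heads are cached
    v=np.random.randn(2, 10, 16),
    n_kv_heads=2,
)
```

With 2 KV heads serving 8 query heads, the cache holds a quarter of the key/value tensors that full multi-head attention would require.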
📄 License
Model Information Table
Other Information
- Inference: false
- Supported Languages: en, fr, de, es, it, pt, ja, ko, zh, ar
- Library Name: transformers
- Extra Gated Prompt: "By submitting this form, you agree to the License Agreement and acknowledge that the information you provide will be collected, used, and shared in accordance with Cohere’s Privacy Policy. You’ll receive email updates about Cohere Labs and Cohere research, events, products and services. You can unsubscribe at any time."
- Extra Gated Fields:
- Name: text
- Affiliation: text
- Country:
- type: select
- options: A long list of countries as provided in the original document.
- Agreement: I agree to use this model for non-commercial use ONLY (checkbox)