🚀 Model Card for Aya-Expanse-32B
Aya Expanse 32B is an open-weight research release with highly advanced multilingual capabilities. It combines a high-performing pre-trained model from the Command family with a year's worth of dedicated research from Cohere Labs, including data arbitrage, multilingual preference training, safety tuning, and model merging. The result is a powerful multilingual large language model supporting 23 languages.
This model card covers the 32-billion-parameter version of the Aya Expanse model. We also released an 8-billion-parameter version, which can be found here.
🚀 Quick Start
Try it: Aya Expanse in Action
Use the Cohere playground or our Hugging Face Space for interactive exploration.
How to Use Aya Expanse
Install the transformers library and load Aya Expanse 32B as follows:
```python
# pip install -U transformers
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "CohereLabs/aya-expanse-32b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format the message with the chat template
# Turkish prompt: "Write a letter to my mom telling her how much I love her"
messages = [{"role": "user", "content": "Anneme onu ne kadar sevdiğimi anlatan bir mektup yaz"}]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")

gen_tokens = model.generate(
    input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)

gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
```
✨ Features
- Multilingual Capabilities: Supports 23 languages, including Arabic, Chinese (simplified & traditional), Czech, Dutch, English, French, German, Greek, Hebrew, Hindi, Indonesian, Italian, Japanese, Korean, Persian, Polish, Portuguese, Romanian, Russian, Spanish, Turkish, Ukrainian, and Vietnamese (a short usage sketch follows this list).
- Optimized Architecture: An auto-regressive language model using an optimized transformer architecture, with post-training that includes supervised finetuning, preference training, and model merging.
- Long Context Length: Supports a context length of 128K tokens.
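Because Aya Expanse uses the same chat template for every language, the Quick Start pattern above extends directly to any of the 23 supported languages. Below is a minimal sketch; the prompts are illustrative examples written for this card, not drawn from the original documentation:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "CohereLabs/aya-expanse-32b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Illustrative prompts in a few of the 23 supported languages
prompts = [
    "Écris un court poème sur la mer",        # French: "Write a short poem about the sea"
    "おすすめの本を3冊教えてください",          # Japanese: "Recommend three books"
    "Escribe una receta sencilla de paella",  # Spanish: "Write a simple paella recipe"
]

for prompt in prompts:
    messages = [{"role": "user", "content": prompt}]
    input_ids = tokenizer.apply_chat_template(
        messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
    )
    gen_tokens = model.generate(input_ids, max_new_tokens=100, do_sample=True, temperature=0.3)
    # Decode only the newly generated tokens
    print(tokenizer.decode(gen_tokens[0, input_ids.shape[-1]:], skip_special_tokens=True))
```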
📚 Documentation
Model Details
| Property | Details |
|----------|---------|
| Input | The model takes text as input only. |
| Output | The model generates text only. |
| Model Architecture | Aya Expanse 32B is an auto-regressive language model that uses an optimized transformer architecture. Post-training includes supervised finetuning, preference training, and model merging. |
| Languages covered | The model is particularly optimized for multilinguality and supports Arabic, Chinese (simplified & traditional), Czech, Dutch, English, French, German, Greek, Hebrew, Hindi, Indonesian, Italian, Japanese, Korean, Persian, Polish, Portuguese, Romanian, Russian, Spanish, Turkish, Ukrainian, and Vietnamese. |
| Context length | 128K |
Evaluation
We evaluated Aya Expanse 32B against Gemma 2 27B, Llama 3.1 70B, Mixtral 8x22B, and Qwen 2.5 35B using the dolly_human_edited subset of the Aya Evaluation Suite dataset and m-ArenaHard, a dataset based on the Arena-Hard-Auto dataset and translated into the 23 languages supported by Aya Expanse. Win rates were determined using gpt-4o-2024-08-06 as a judge. For a conservative benchmark, we report results from gpt-4o-2024-08-06, though gpt-4o-mini scores showed even stronger performance.
The m-ArenaHard dataset, used to evaluate Aya Expanse’s capabilities, is publicly available here.
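For reference, both evaluation datasets can be loaded with the Hugging Face datasets library. The sketch below assumes the hub IDs CohereLabs/aya_evaluation_suite and CohereLabs/m-ArenaHard; check the dataset pages linked above for the exact names and configurations:

```python
from datasets import load_dataset

# dolly_human_edited subset of the Aya Evaluation Suite
# (hub IDs are assumptions; verify them on the Hugging Face Hub)
dolly = load_dataset("CohereLabs/aya_evaluation_suite", "dolly_human_edited")

# m-ArenaHard prompts, translated into the 23 supported languages;
# pass a language configuration here if the dataset defines one
m_arena_hard = load_dataset("CohereLabs/m-ArenaHard")

print(dolly)
print(m_arena_hard)
```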

WhatsApp Integration
You can also talk to Aya Expanse through the popular messaging service WhatsApp. Use this link to open a WhatsApp chat with Aya Expanse.
If you don’t have WhatsApp installed on your machine, you may need to install it; if you have it on your phone, you can follow the on-screen instructions to link your phone with WhatsApp Web.
By the end, you should see a text window that you can use to chat with the model.
More details about our WhatsApp integration are available here.
📄 License
- This model is governed by a CC-BY-NC license and requires adherence to Cohere Labs' Acceptable Use Policy.
- Extra Gated Prompt: By submitting this form, you agree to the License Agreement and acknowledge that the information you provide will be collected, used, and shared in accordance with Cohere’s Privacy Policy. You’ll receive email updates about Cohere Labs and Cohere research, events, products and services. You can unsubscribe at any time.
🔧 Technical Details
Model Information
- Developed by: Cohere Labs
- Point of Contact: Cohere Labs
- Model: Aya Expanse 32B
- Model Size: 32 billion parameters
Cite
You can cite Aya Expanse using:
```bibtex
@misc{dang2024ayaexpansecombiningresearch,
  title={Aya Expanse: Combining Research Breakthroughs for a New Multilingual Frontier},
  author={John Dang and Shivalika Singh and Daniel D'souza and Arash Ahmadian and Alejandro Salamanca and Madeline Smith and Aidan Peppin and Sungjin Hong and Manoj Govindassamy and Terrence Zhao and Sandra Kublik and Meor Amer and Viraat Aryabumi and Jon Ander Campos and Yi-Chern Tan and Tom Kocmi and Florian Strub and Nathan Grinsztajn and Yannis Flet-Berliac and Acyr Locatelli and Hangyu Lin and Dwarak Talupuru and Bharat Venkitesh and David Cairuz and Bowen Yang and Tim Chung and Wei-Yin Ko and Sylvie Shang Shi and Amir Shukayev and Sammie Bae and Aleksandra Piktus and Roman Castagné and Felipe Cruz-Salinas and Eddie Kim and Lucas Crawhall-Stein and Adrien Morisot and Sudip Roy and Phil Blunsom and Ivan Zhang and Aidan Gomez and Nick Frosst and Marzieh Fadaee and Beyza Ermis and Ahmet Üstün and Sara Hooker},
  year={2024},
  eprint={2412.04261},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2412.04261},
}
```
Model Card Contact
For errors or additional questions about details in this model card, contact labs@cohere.com.