🚀 EuroLLM-9B
A 9B parameter multilingual transformer LLM capable of handling a wide range of languages.
🚀 Quick Start
Here is a simple example of how to run the EuroLLM-9B model:
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "utter-project/EuroLLM-9B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
text = "English: My name is EuroLLM. Portuguese:"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
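For GPU inference, a common variant is sketched below; it relies only on the standard transformers API (torch_dtype and device_map, the latter requiring the accelerate package), not on anything EuroLLM-specific, and loads the weights in bfloat16 before generating:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "utter-project/EuroLLM-9B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# bfloat16 halves the memory footprint; device_map="auto" (needs the accelerate
# package) places the weights on the available GPU(s).
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

text = "English: My name is EuroLLM. Portuguese:"
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))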
✨ Features
- Multilingual Support: Capable of understanding and generating text in a wide range of languages, including all European Union languages and some additional relevant languages.
- Optimized Architecture: Uses grouped query attention (GQA) with 8 key-value heads, pre-layer normalization with RMSNorm, the SwiGLU activation function, and rotary positional embeddings (RoPE).
- Large-scale Training: Trained on 4 trillion tokens across multiple data sources.
📦 Installation
EuroLLM-9B is loaded through the Hugging Face transformers
library, which you can install via pip:
pip install transformers
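The Quick Start example also relies on PyTorch, which is not always pulled in automatically; if it is missing, installing it separately should be enough (the exact command may vary with your platform and CUDA setup):

pip install torch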
📚 Documentation
Model Details
The EuroLLM project aims to create a suite of LLMs that can understand and generate text in all European Union languages and some additional relevant languages. EuroLLM-9B is a 9B parameter model trained on 4 trillion tokens from various data sources, including web data, parallel data, and high-quality datasets. EuroLLM-9B-Instruct was further instruction tuned on EuroBlocks.
| Property | Details |
|----------|---------|
| Model Type | A 9B parameter multilingual transformer LLM |
| Languages (NLP) | Bulgarian, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Greek, Hungarian, Irish, Italian, Latvian, Lithuanian, Maltese, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish, Arabic, Catalan, Chinese, Galician, Hindi, Japanese, Korean, Norwegian, Russian, Turkish, and Ukrainian |
| License | Apache License 2.0 |
| Developed by | Unbabel, Instituto Superior Técnico, Instituto de Telecomunicações, University of Edinburgh, Aveni, University of Paris-Saclay, University of Amsterdam, Naver Labs, Sorbonne Université |
| Funded by | European Union |
Model Description
EuroLLM uses a standard, dense Transformer architecture:
- Grouped query attention (GQA) with 8 key-value heads to increase inference speed while maintaining downstream performance (a minimal sketch of GQA follows this list).
- Pre-layer normalization with RMSNorm for improved training stability and faster computation.
- SwiGLU activation function for good results on downstream tasks.
- Rotary positional embeddings (RoPE) in every layer for good performance and context length extension.
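The sketch below illustrates how grouped query attention lets 32 query heads share 8 key-value heads; the tensor names, shapes, and use of repeat_interleave are purely illustrative and are not taken from the actual EuroLLM implementation.

import torch

batch, seq_len = 1, 16
n_heads, n_kv_heads, head_dim = 32, 8, 128  # 32 heads * 128 dims = 4,096 embedding size

q = torch.randn(batch, n_heads, seq_len, head_dim)
k = torch.randn(batch, n_kv_heads, seq_len, head_dim)
v = torch.randn(batch, n_kv_heads, seq_len, head_dim)

# Each group of 32 / 8 = 4 query heads reads from the same key-value head, so K and V
# are repeated along the head dimension before ordinary scaled dot-product attention.
groups = n_heads // n_kv_heads
k = k.repeat_interleave(groups, dim=1)
v = v.repeat_interleave(groups, dim=1)

scores = q @ k.transpose(-2, -1) / head_dim ** 0.5
output = torch.softmax(scores, dim=-1) @ v
print(output.shape)  # torch.Size([1, 32, 16, 128])

Because only 8 key-value heads need to be cached, the KV cache is a quarter of the size it would be with full multi-head attention, which is where the inference-speed benefit comes from.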
Here is a summary of the model hyper-parameters (a short sketch for reading them back from the released config follows the table):
| Property | Details |
|----------|---------|
| Sequence Length | 4,096 |
| Number of Layers | 42 |
| Embedding Size | 4,096 |
| FFN Hidden Size | 12,288 |
| Number of Heads | 32 |
| Number of KV Heads (GQA) | 8 |
| Activation Function | SwiGLU |
| Position Encodings | RoPE (Θ = 10,000) |
| Layer Norm | RMSNorm |
| Tied Embeddings | No |
| Embedding Parameters | 0.524B |
| LM Head Parameters | 0.524B |
| Non-embedding Parameters | 8.105B |
| Total Parameters | 9.154B |
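As a quick sanity check, these values can be read back from the released checkpoint's configuration. The field names below assume the usual Hugging Face naming for LLaMA-style models (hidden_size, num_key_value_heads, rope_theta, and so on) and could differ if EuroLLM ships a custom config class.

from transformers import AutoConfig

config = AutoConfig.from_pretrained("utter-project/EuroLLM-9B")
print(config.num_hidden_layers)        # expected: 42
print(config.hidden_size)              # expected: 4096
print(config.intermediate_size)        # expected: 12288
print(config.num_attention_heads)      # expected: 32
print(config.num_key_value_heads)      # expected: 8
print(config.rope_theta)               # expected: 10000
print(config.max_position_embeddings)  # expected: 4096

# The 0.524B embedding (and LM head) parameters correspond to vocab_size * hidden_size,
# which at a hidden size of 4,096 implies a vocabulary of roughly 128,000 tokens.
print(config.vocab_size * config.hidden_size)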
Results
EU Languages
Table 1: Comparison of open-weight LLMs on multilingual benchmarks. The Borda count corresponds to the average ranking of the models (see Colombo et al., 2022). For Arc-challenge, Hellaswag, and MMLU we use the Okapi datasets ([Lai et al., 2023](https://aclanthology.org/2023.emnlp-demo.28/)), which cover 11 languages. For MMLU-Pro and MUSR we translate the English version with Tower (Alves et al., 2024) into 6 EU languages.
* As there are no public versions of the pre-trained models, we evaluated them using the post-trained versions.
EuroLLM-9B shows superior performance on multilingual tasks compared to other European-developed models and strong competitiveness with non-European models.
English

Table 2: Comparison of open-weight LLMs on English general benchmarks.
* As there are no public versions of the pre-trained models, we evaluated them using the post-trained versions.
EuroLLM shows strong performance on English tasks, surpassing most European-developed models and matching the performance of Mistral-7B.
Bias, Risks, and Limitations
⚠️ Important Note
EuroLLM-9B has not been aligned to human preferences, so the model may generate problematic outputs (e.g., hallucinations, harmful content, or false statements).