🚀 Jais-13b
Jais-13b is a 13-billion parameter pre-trained bilingual large language model supporting both Arabic and English. It offers high-quality language processing capabilities for these two languages.
🚀 Quick Start
Below is sample code for using the model. The model requires a custom model class, so you must set trust_remote_code=True when loading it. This code was tested with transformers==4.28.0.
# -*- coding: utf-8 -*-
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = "core42/jais-13b"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_path)
# trust_remote_code=True is required because Jais ships a custom model class.
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto", trust_remote_code=True)


def get_response(text, tokenizer=tokenizer, model=model):
    # Tokenize the prompt and move it to the selected device.
    input_ids = tokenizer(text, return_tensors="pt").input_ids
    inputs = input_ids.to(device)
    input_len = inputs.shape[-1]
    # Sample a continuation with nucleus sampling at a low temperature.
    generate_ids = model.generate(
        inputs,
        top_p=0.9,
        temperature=0.3,
        max_length=200-input_len,
        min_length=input_len + 4,
        repetition_penalty=1.2,
        do_sample=True,
    )
    # Decode the first (and only) sequence in the batch back to text.
    response = tokenizer.batch_decode(
        generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=True
    )[0]
    return response


# Arabic prompt meaning "The capital of the United Arab Emirates is", left unfinished for the model to complete.
text = "عاصمة دولة الإمارات العربية المتحدة ه"
print(get_response(text))

text = "The capital of UAE is"
print(get_response(text))
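
In the sample above, `max_length` shrinks as the prompt grows while `min_length` grows, and in transformers both values count prompt tokens plus generated tokens. For prompts longer than 98 tokens the two bounds cross. The helper below is a minimal sketch of one way to keep them consistent; `length_bounds`, `total_budget`, and `min_new_tokens` are hypothetical names introduced here for illustration, not part of the Jais code or the transformers API.

```python
def length_bounds(input_len, total_budget=200, min_new_tokens=4):
    """Return (max_length, min_length) suitable for model.generate().

    Both values are total sequence lengths (prompt + generated tokens).
    total_budget and min_new_tokens are illustrative defaults that mirror
    the constants 200 and 4 used in the sample above.
    """
    min_length = input_len + min_new_tokens
    # Never let the cap fall below the floor, even for long prompts.
    max_length = max(total_budget, min_length)
    return max_length, min_length


print(length_bounds(10))   # (200, 14)
print(length_bounds(300))  # (304, 304)
```

The returned pair could be passed as `max_length=` and `min_length=` in the `model.generate(...)` call in place of the two inline expressions.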
✨ Features
- Bilingual Support: It supports both Arabic and English, trained on a large-scale bilingual dataset.
- Advanced Architecture: Built on a transformer-based, decoder-only (GPT-3) architecture with SwiGLU non-linearity and ALiBi position embeddings, enabling better context handling and precision.
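
To make the two architectural terms concrete, here is a small pure-Python sketch of ALiBi attention biases and SwiGLU gating. It is not the Jais implementation: the head count of 4 and the toy vectors are assumptions chosen for readability, the slope formula is the one the ALiBi paper gives for power-of-two head counts, and `gate` and `value` stand in for the outputs of two linear projections that are omitted here.

```python
import math


def alibi_slopes(num_heads):
    """Per-head ALiBi slopes for a power-of-two head count."""
    # Head h (1-indexed) gets slope 2^(-8h / num_heads).
    return [2 ** (-8 * h / num_heads) for h in range(1, num_heads + 1)]


def alibi_bias(slope, seq_len):
    """Causal bias added to attention scores: 0 on the diagonal,
    increasingly negative for keys further in the past, -inf for the future."""
    return [
        [slope * (k - q) if k <= q else float("-inf") for k in range(seq_len)]
        for q in range(seq_len)
    ]


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def swiglu(gate, value):
    """SwiGLU: Swish(gate) multiplied elementwise with value.

    gate and value are assumed to be two already-projected vectors.
    """
    return [g * sigmoid(g) * v for g, v in zip(gate, value)]


slopes = alibi_slopes(4)
print(slopes)                       # [0.25, 0.0625, 0.015625, 0.00390625]
print(alibi_bias(slopes[0], 3)[2])  # [-0.5, -0.25, 0.0]
print(swiglu([0.0, 1.0], [2.0, 2.0]))
```

Because the bias depends only on the distance between query and key, no learned position table is needed, which is why ALiBi is often associated with handling longer contexts than those seen in training.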
📚 Documentation
Model Details
Property | Details |
---|---|
Developed by | Inception, Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), and Cerebras Systems |
Language(s) (NLP) | Arabic and English |
License | Apache 2.0 |
Input | Text only data |
Output | Model generates text |
Paper | Jais and Jais-chat: Arabic-Centric Foundation and Instruction-Tuned Open Generative Large Language Models |
Demo | [Access here](https://arabic-gpt.ai) |
Intended Use
We release the Jais 13B model under a fully open-source license and welcome all feedback and collaboration opportunities.
This model is the first release from the Inception-MBZUAI-Cerebras partnership. At the time of release, it achieved state-of-the-art results across a comprehensive Arabic test suite, as described in the accompanying technical report.
Some potential downstream uses include:
- Research: Can be used by researchers and developers.
- Commercial Use: Can be used as a base model for further fine-tuning for specific use cases (similar to [jais-13b-chat](https://huggingface.co/inception-mbzuai/jais-13b-chat)). Some potential use cases are chat assistants and customer service.
Audiences that may benefit from our model:
- Academics: For those researching Arabic natural language processing.
- Businesses: Companies targeting Arabic-speaking audiences.
- Developers: Those integrating Arabic language capabilities in apps.
Out-of-Scope Use
While Jais-13b is a powerful Arabic and English bilingual model, it's important to understand its limitations and the potential for misuse. It is prohibited to use the model in any way that violates applicable laws or regulations.
The following are some example scenarios where the model should not be used:
- Malicious Use: Should not be used for generating harmful, misleading, or inappropriate content, such as hate speech, violence promotion, discrimination, spreading misinformation or fake news, and engaging in or promoting illegal activities.
- Sensitive Information: Should not be used to handle or generate personal, confidential, or sensitive information.
- Generalization Across All Languages: Jais-13b is bilingual and optimized for Arabic and English. Do not assume it has equal proficiency in other languages or dialects.
- High-Stakes Decisions: Should not be used to make high-stakes decisions without human oversight, including medical, legal, financial, or safety-critical decisions.
Bias, Risks, and Limitations
The model is trained on publicly available data that was partially curated by Inception. Different techniques have been employed to reduce bias, but like all LLMs, it may still exhibit some bias.
The model is trained as an AI assistant for Arabic and English speakers. It is limited to producing responses for queries in these two languages and may not produce appropriate responses to other language queries.
By using Jais, you acknowledge and accept that, like any large language model, it may generate incorrect, misleading, and/or offensive information or content. The information is not intended as advice and should not be relied upon. We are not responsible for any content or consequences resulting from its use. We are continuously working to develop models with greater capabilities and welcome any feedback on the model.
Copyright Inception Institute of Artificial Intelligence Ltd. JAIS is made available under the Apache License, Version 2.0 (the “License”). You shall not use JAIS except in compliance with the License. You may obtain a copy of the License at https://www.apache.org/licenses/LICENSE-2.0.
Unless required by applicable law or agreed to in writing, JAIS is distributed on an AS IS basis, without warranties or conditions of any kind, either express or implied. Please see the terms of the License for the specific language permissions and limitations under the License.
Training Details
Training Data
For the pre-training of Jais-13b, a diverse bilingual corpus sourced from the Web and other sources was used, along with publicly available English and code datasets.
To collect Arabic data, multiple sources were used, including web pages, Wikipedia articles, news articles, Arabic books, and social network content. The volume of Arabic data was augmented by translating English to Arabic using an in-house machine translation system, restricted to high-quality English resources such as English Wikipedia and English books. Further details about the training data can be found in the technical report.
Training Procedure
Training was performed on the Condor Galaxy 1 (CG-1) supercomputer platform.
Training Hyperparameters
Hyperparameter | Value |
---|---|
Precision | fp32 |
Optimizer | AdamW |
Learning rate | 0 to 0.012 (<= 95 steps); 0.012 to 0.0012 (> 95 steps) |
Weight decay | 0.1 |
Batch size | 1920 |
Steps | 100551 |
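
The learning-rate row in the table describes a two-phase schedule: a warm-up from 0 to 0.012 over the first 95 steps, then a decay from 0.012 to 0.0012 over the remaining steps. The sketch below turns that row into a function. The table does not state the shape of either phase, so the linear warm-up and linear decay used here are assumptions, and `learning_rate` is a hypothetical helper rather than code from the Jais training setup.

```python
PEAK_LR = 0.012       # from the table
FINAL_LR = 0.0012     # from the table
WARMUP_STEPS = 95     # from the table
TOTAL_STEPS = 100551  # from the table


def learning_rate(step):
    """Learning rate after `step` optimizer steps."""
    if step <= WARMUP_STEPS:
        # ASSUMPTION: linear warm-up from 0 to the peak.
        return PEAK_LR * (step / WARMUP_STEPS)
    # ASSUMPTION: linear decay from the peak to the final value.
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR + (FINAL_LR - PEAK_LR) * min(progress, 1.0)


print(f"{learning_rate(0):.4f}")       # 0.0000
print(f"{learning_rate(95):.4f}")      # 0.0120
print(f"{learning_rate(100551):.4f}")  # 0.0012
```

The final rate is one tenth of the peak, so whichever decay shape was actually used, the schedule ends at a 10x reduction.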
Evaluation
We conducted a comprehensive evaluation of Jais and benchmarked it against other leading base language models, focusing on both English and Arabic. The evaluation criteria spanned various dimensions, including knowledge, reasoning, and assessment of misinformation/bias.
Arabic evaluation results:
Models | Avg | EXAMS | MMLU (M) | LitQA | HellaSwag | PIQA | BoolQA | SituatedQA | ARC-C | OpenBookQA | TruthfulQA | CrowS-Pairs |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Jais (13B) | 46.5 | 40.4 | 30.0 | 58.3 | 57.7 | 67.6 | 62.6 | 42.5 | 35.8 | 32.4 | 41.1 | 58.4 |
BLOOM (7.1B) | 40.9 | 34.0 | 28.2 | 37.1 | 40.9 | 58.4 | 59.9 | 39.1 | 27.3 | 28.0 | 44.4 | 53.5 |
LLaMA2 (13B) | 38.1 | 29.2 | 28.4 | 32.0 | 34.3 | 52.9 | 63.8 | 36.4 | 24.3 | 30.0 | 45.5 | 49.9 |
AraT5 (220M) | 32.0 | 24.7 | 23.8 | 26.3 | 25.5 | 50.4 | 58.2 | 33.9 | 24.7 | 25.4 | 20.9 | 47.2 |
AraBART (139M) | 36.7 | 26.5 | 27.5 | 34.3 | 28.1 | 52.6 | 57.1 | 34.6 | 25.1 | 28.6 | 49.8 | 48.8 |
All tasks above report accuracy or F1 scores (the higher the better). For brevity, English task results are not included. Detailed comparisons in both languages and evaluation dataset details can be found in the technical report.
📄 License
This project is licensed under the Apache 2.0 license.
📖 Citation
@misc{sengupta2023jais,
title={Jais and Jais-chat: Arabic-Centric Foundation and Instruction-Tuned Open Generative Large Language Models},
author={Neha Sengupta and Sunil Kumar Sahu and Bokang Jia and Satheesh Katipomu and Haonan Li and Fajri Koto and Osama Mohammed Afzal and Samta Kamboj and Onkar Pandit and Rahul Pal and Lalit Pradhan and Zain Muhammad Mujahid and Massa Baali and Alham Fikri Aji and Zhengzhong Liu and Andy Hock and Andrew Feldman and Jonathan Lee and Andrew Jackson and Preslav Nakov and Timothy Baldwin and Eric Xing},
year={2023},
eprint={2308.16149},
archivePrefix={arXiv},
primaryClass={cs.CL}
}

