JAIS-13B: Open-Source Bilingual Large Language Model for Arabic and English Dialogue Applications

Jais 13b

Developed by inceptionai

JAIS-13B is a 13-billion-parameter pre-trained large language model based on the GPT-3 architecture and optimized for Arabic and English.

Large Language Model

Transformers

Supports Multiple Languages

Open Source License: Apache-2.0

#Arabic-English Bilingual #ALiBi Long Sequence Processing #Arabic NLP Optimization

Release Time : 8/17/2023

Model Overview

JAIS-13B is an Arabic-English bilingual large language model built on the Transformer architecture for text generation. At the time of release it achieved state-of-the-art results on a comprehensive Arabic test suite, and its license permits both research and commercial use.

Model Features

Bilingual Capability

Optimized for Arabic and English, with state-of-the-art results on Arabic benchmarks at the time of release

Long Sequence Processing

Uses ALiBi position embeddings, which allow the model to extrapolate to long sequences

Open-Source License

Released under the Apache 2.0 license, permitting both research and commercial use

Model Capabilities

Arabic Text Generation

English Text Generation

Code Generation

Question Answering Systems

Chatbots

Use Cases

Research Applications

Arabic NLP Research

Used for Arabic natural language processing research

Commercial Applications

Customer Service

Can serve as a base model for fine-tuning on specific customer service scenarios

Chat Assistants

Develop chat assistants for Arabic-speaking users

🚀 Jais-13b

Jais-13b is a 13-billion-parameter pre-trained bilingual large language model supporting both Arabic and English. It offers high-quality language processing capabilities for these two languages.

🚀 Quick Start

Below is sample code for using the model. The model requires a custom model class, so you must pass trust_remote_code=True when loading it. This code was tested with transformers==4.28.0.

# -*- coding: utf-8 -*-

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
model_path = "core42/jais-13b"

device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto", trust_remote_code=True)


def get_response(text, tokenizer=tokenizer, model=model):
    # Tokenize the prompt and move the token IDs to the selected device.
    input_ids = tokenizer(text, return_tensors="pt").input_ids
    inputs = input_ids.to(device)
    input_len = inputs.shape[-1]
    generate_ids = model.generate(
        inputs,
        top_p=0.9,
        temperature=0.3,
        # max_length and min_length both include the prompt tokens.
        max_length=200,
        min_length=input_len + 4,
        repetition_penalty=1.2,
        do_sample=True,
    )
    # Decode the first (and only) generated sequence, dropping special tokens.
    response = tokenizer.batch_decode(
        generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=True
    )[0]
    return response


text= "عاصمة دولة الإمارات العربية المتحدة ه"
print(get_response(text))

text = "The capital of UAE is"
print(get_response(text))
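
Both `max_length` and `min_length` in `generate` are measured over the whole sequence, prompt included, so for long prompts the two bounds can cross. The following is a minimal sketch of a pre-check that could run before calling `generate`. The helper name `length_bounds`, the 200-token cap, and the 4-token minimum are illustrative assumptions based on the values in the sample, not part of the model's API.

```python
def length_bounds(input_len, total_cap=200, min_new_tokens=4):
    """Return (min_length, max_length), both counted with the prompt included.

    Hypothetical helper: total_cap and min_new_tokens mirror the sample's
    values and are assumptions, not requirements of the model.
    """
    min_length = input_len + min_new_tokens
    if min_length > total_cap:
        raise ValueError(
            f"prompt of {input_len} tokens leaves no room for "
            f"{min_new_tokens} new tokens under a cap of {total_cap}"
        )
    return min_length, total_cap


# A 10-token prompt must reach 14 tokens and may grow to 200 in total.
print(length_bounds(10))  # (14, 200)
```

The returned pair could be passed as `min_length` and `max_length`; a prompt that is too long then fails early with a readable message instead of producing conflicting length constraints.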

✨ Features

Bilingual Support: It supports both Arabic and English, trained on a large-scale bilingual dataset.
Advanced Architecture: A transformer-based, decoder-only (GPT-3) architecture with SwiGLU non-linearity and ALiBi position embeddings, which improve context handling and precision.
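
ALiBi replaces learned position embeddings with a fixed, head-specific linear penalty on attention scores that grows with the distance between query and key. The sketch below illustrates the idea in plain Python, following the slope recipe from the ALiBi paper for a power-of-two number of heads. It is not the Jais implementation: the function names are hypothetical, and the head count of 8 is chosen only for illustration, since the model's actual head count is not given here.

```python
def alibi_slopes(num_heads):
    """Per-head slopes: a geometric sequence starting at 2 ** (-8 / num_heads).

    Assumption: this sketch only handles power-of-two head counts.
    """
    if num_heads <= 0 or num_heads & (num_heads - 1):
        raise ValueError("this sketch only handles power-of-two head counts")
    ratio = 2.0 ** (-8.0 / num_heads)
    return [ratio ** (h + 1) for h in range(num_heads)]


def alibi_bias(seq_len, slope):
    """Causal bias matrix added to attention logits before the softmax.

    Entry [i][j] is -slope * (i - j) for keys at or before the query (j <= i)
    and -inf for future keys (j > i).
    """
    neg_inf = float("-inf")
    return [
        [-slope * (i - j) if j <= i else neg_inf for j in range(seq_len)]
        for i in range(seq_len)
    ]


slopes = alibi_slopes(8)
print(slopes[:3])  # [0.5, 0.25, 0.125]
for row in alibi_bias(4, slopes[0]):
    print(row)
```

Because the penalty depends only on the distance `i - j`, the same rule applies at any sequence length, which is what makes extrapolation beyond the training length possible.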

📚 Documentation

Model Details

Property	Details
Developed by	Inception, Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), and Cerebras Systems
Language(s) (NLP)	Arabic and English
License	Apache 2.0
Input	Text-only data
Output	Model generates text
Paper	Jais and Jais-chat: Arabic-Centric Foundation and Instruction-Tuned Open Generative Large Language Models
Demo	[Access here](https://arabic-gpt.ai)

Intended Use

We release the Jais 13B model under a full open-source license and welcome all feedback and collaboration opportunities.

This model is the first release from the Inception-MBZUAI-Cerebras partnership. At the time of release, it achieved state-of-the-art results across a comprehensive Arabic test suite, as described in the accompanying technical report.

Some potential downstream uses include:

Research: Can be used by researchers and developers.
Commercial Use: Can be used as a base model for further fine-tuning for specific use cases (similar to [jais-13b-chat](https://huggingface.co/inception-mbzuai/jais-13b-chat)). Some potential use cases are chat assistants and customer service.

Audiences that may benefit from our model:

Academics: For those researching Arabic natural language processing.
Businesses: Companies targeting Arabic-speaking audiences.
Developers: Those integrating Arabic language capabilities in apps.

Out-of-Scope Use

While Jais-13b is a powerful Arabic and English bilingual model, it's important to understand its limitations and the potential for misuse. It is prohibited to use the model in any way that violates applicable laws or regulations.

The following are some example scenarios where the model should not be used:

Malicious Use: Should not be used for generating harmful, misleading, or inappropriate content, such as hate speech, violence promotion, discrimination, spreading misinformation or fake news, and engaging in or promoting illegal activities.
Sensitive Information: Should not be used to handle or generate personal, confidential, or sensitive information.
Generalization Across All Languages: Jais-13b is bilingual and optimized for Arabic and English. Do not assume it has equal proficiency in other languages or dialects.
High-Stakes Decisions: Should not be used to make high-stakes decisions without human oversight, including medical, legal, financial, or safety-critical decisions.

Bias, Risks, and Limitations

The model is trained on publicly available data partially curated by Inception. Different techniques have been employed to reduce bias, but like all LLMs, it may still exhibit some bias.

The model is trained as an AI assistant for Arabic and English speakers. It is limited to producing responses for queries in these two languages and may not produce appropriate responses to other language queries.

By using Jais, you acknowledge and accept that, like any large language model, it may generate incorrect, misleading, and/or offensive information or content. The information is not intended as advice and should not be relied upon. We are not responsible for any content or consequences resulting from its use. We are continuously working to develop models with greater capabilities and welcome any feedback on the model.

Copyright Inception Institute of Artificial Intelligence Ltd. JAIS is made available under the Apache License, Version 2.0 (the “License”). You shall not use JAIS except in compliance with the License. You may obtain a copy of the License at https://www.apache.org/licenses/LICENSE-2.0.

Unless required by applicable law or agreed to in writing, JAIS is distributed on an AS IS basis, without warranties or conditions of any kind, either express or implied. Please see the terms of the License for the specific language permissions and limitations under the License.

Training Details

Training Data

For the pre-training of Jais-13b, a diverse bilingual corpus sourced from the Web and other sources was used, along with publicly available English and code datasets.

To collect Arabic data, multiple sources were used, including web pages, Wikipedia articles, news articles, Arabic books, and social network content. The volume of Arabic data was augmented by translating English to Arabic using an in-house machine translation system, restricted to high-quality English resources such as English Wikipedia and English books. Further details about the training data can be found in the technical report.

Training Procedure

Training was performed on the Condor Galaxy 1 (CG-1) supercomputer platform.

Training Hyperparameters

Hyperparameter	Value
Precision	fp32
Optimizer	AdamW
Learning rate	0 to 0.012 (<= 95 steps); 0.012 to 0.0012 (> 95 steps)
Weight decay	0.1
Batch size	1920
Steps	100551
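
The learning-rate row can be read as a warmup from 0 to 0.012 over the first 95 steps, followed by a decay from 0.012 to 0.0012 by the final step, 100551. The sketch below assumes both phases are linear, which the table does not state; the function name `learning_rate` and its parameter names are illustrative.

```python
def learning_rate(step, peak=0.012, final=0.0012, warmup=95, total=100551):
    """Learning rate at a given step under an assumed linear warmup and decay.

    Assumption: the table gives only the endpoints of each phase; the linear
    shape of both phases is a guess made for illustration.
    """
    if step <= warmup:
        # Warmup phase: ramp from 0 up to the peak value.
        return peak * step / warmup
    # Decay phase: interpolate from peak to final, clamped after the last step.
    frac = min((step - warmup) / (total - warmup), 1.0)
    return peak + (final - peak) * frac


for s in (0, 95, 50000, 100551):
    print(s, learning_rate(s))
```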

Evaluation

We conducted a comprehensive evaluation of Jais and benchmarked it against other leading base language models, focusing on both English and Arabic. The evaluation criteria spanned various dimensions, including knowledge, reasoning, and assessment of misinformation/bias.

Arabic evaluation results:

Models	Avg	EXAMS	MMLU (M)	LitQA	Hellaswag	PIQA	BoolQA	SituatedQA	ARC-C	OpenBookQA	TruthfulQA	CrowS-Pairs
Jais (13B)	46.5	40.4	30.0	58.3	57.7	67.6	62.6	42.5	35.8	32.4	41.1	58.4
BLOOM (7.1B)	40.9	34.0	28.2	37.1	40.9	58.4	59.9	39.1	27.3	28.0	44.4	53.5
LLaMA2 (13B)	38.1	29.2	28.4	32.0	34.3	52.9	63.8	36.4	24.3	30.0	45.5	49.9
AraT5 (220M)	32.0	24.7	23.8	26.3	25.5	50.4	58.2	33.9	24.7	25.4	20.9	47.2
AraBART (139M)	36.7	26.5	27.5	34.3	28.1	52.6	57.1	34.6	25.1	28.6	49.8	48.8

All tasks above report accuracy or F1 scores (the higher the better). For brevity, English task results are not included. Detailed comparisons in both languages and evaluation dataset details can be found in the technical report.
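
To make the table easier to read, the following sketch finds the top-scoring model for each task. The per-task scores are copied from the table above (the Avg column is left out); the variable and function names are illustrative.

```python
TASKS = ["EXAMS", "MMLU (M)", "LitQA", "Hellaswag", "PIQA", "BoolQA",
         "SituatedQA", "ARC-C", "OpenBookQA", "TruthfulQA", "CrowS-Pairs"]

# Per-task scores in the same order as TASKS, copied from the table.
SCORES = {
    "Jais (13B)":     [40.4, 30.0, 58.3, 57.7, 67.6, 62.6, 42.5, 35.8, 32.4, 41.1, 58.4],
    "BLOOM (7.1B)":   [34.0, 28.2, 37.1, 40.9, 58.4, 59.9, 39.1, 27.3, 28.0, 44.4, 53.5],
    "LLaMA2 (13B)":   [29.2, 28.4, 32.0, 34.3, 52.9, 63.8, 36.4, 24.3, 30.0, 45.5, 49.9],
    "AraT5 (220M)":   [24.7, 23.8, 26.3, 25.5, 50.4, 58.2, 33.9, 24.7, 25.4, 20.9, 47.2],
    "AraBART (139M)": [26.5, 27.5, 34.3, 28.1, 52.6, 57.1, 34.6, 25.1, 28.6, 49.8, 48.8],
}


def best_per_task(scores, tasks):
    """Map each task name to the model with the highest score on it."""
    best = {}
    for i, task in enumerate(tasks):
        best[task] = max(scores, key=lambda name: scores[name][i])
    return best


leaders = best_per_task(SCORES, TASKS)
not_led_by_jais = [t for t in TASKS if leaders[t] != "Jais (13B)"]
print(not_led_by_jais)  # ['BoolQA', 'TruthfulQA']
```

On these numbers Jais (13B) has the highest score on 9 of the 11 Arabic tasks; LLaMA2 (13B) leads on BoolQA and AraBART (139M) on TruthfulQA.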

📄 License

This project is licensed under the Apache 2.0 license.

📖 Citation

@misc{sengupta2023jais,
      title={Jais and Jais-chat: Arabic-Centric Foundation and Instruction-Tuned Open Generative Large Language Models}, 
      author={Neha Sengupta and Sunil Kumar Sahu and Bokang Jia and Satheesh Katipomu and Haonan Li and Fajri Koto and Osama Mohammed Afzal and Samta Kamboj and Onkar Pandit and Rahul Pal and Lalit Pradhan and Zain Muhammad Mujahid and Massa Baali and Alham Fikri Aji and Zhengzhong Liu and Andy Hock and Andrew Feldman and Jonathan Lee and Andrew Jackson and Preslav Nakov and Timothy Baldwin and Eric Xing},
      year={2023},
      eprint={2308.16149},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
