📋 ArabianGPT Model Overview
ArabianGPT-0.3B is a specialized GPT-2 model optimized for Arabic language modeling, developed to address the unique linguistic challenges of Arabic.
🚀 Quick Start
You can use this pre-trained, native Arabic language model as an experimental tool. Here is an example using the Transformers pipeline:
from transformers import pipeline

# Load ArabianGPT-0.3B through the text-generation pipeline
pipe = pipeline("text-generation", model="riotu-lab/ArabianGPT-03B", max_new_tokens=512)

text = ''  # your Arabic prompt here
print(pipe(text)[0]["generated_text"])
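The pipeline returns a list with one dictionary per prompt; the generated text (prompt plus continuation) is stored under the `generated_text` key, which is what the `print` call above reads.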
✨ Features
- Architecture: GPT-2
- Model Size: 345 million parameters
- Layers: 24
- Attention Heads: 16
- Context Window Size: 1024 tokens (see the configuration check below)
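These hyperparameters match the standard GPT-2 medium configuration, and they can be checked programmatically. A minimal sketch, assuming the checkpoint exposes the usual GPT2Config fields (n_layer, n_head, n_positions):

from transformers import AutoConfig

# Fetch the configuration shipped with the checkpoint
config = AutoConfig.from_pretrained("riotu-lab/ArabianGPT-03B")

print(config.n_layer)      # layers, expected: 24
print(config.n_head)       # attention heads, expected: 16
print(config.n_positions)  # context window, expected: 1024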
📦 Installation
To run the examples in this card, install the Hugging Face Transformers library together with a backend such as PyTorch: `pip install transformers torch`.
💻 Usage Examples
Basic Usage
from transformers import pipeline

# Same pipeline as in the Quick Start, repeated here for completeness
pipe = pipeline("text-generation", model="riotu-lab/ArabianGPT-03B", max_new_tokens=512)

text = ''  # your Arabic prompt here
result = pipe(text)
print(result[0]["generated_text"])
Advanced Usage
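For finer control than the pipeline offers, the tokenizer and model can be loaded directly and explicit sampling parameters passed to generate. The sketch below illustrates this; the prompt and the sampling values (temperature, top_p, repetition_penalty) are illustrative assumptions, not settings recommended by the model authors.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "riotu-lab/ArabianGPT-03B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

prompt = "العلم نور"  # Arabic proverb: "Knowledge is light"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=True,          # sample instead of greedy decoding
        temperature=0.7,         # illustrative value only
        top_p=0.9,               # nucleus sampling cutoff
        repetition_penalty=1.2,  # discourages repetition loops
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

Sampling rather than greedy decoding generally reduces the repetition loops that raw GPT-2-class models are prone to.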
📚 Documentation
Introduction
ArabianGPT-0.3B, developed under the ArabianLLM initiatives, is a specialized GPT-2 model optimized for Arabic language modeling. It is a product of the collaborative efforts at Prince Sultan University's Robotics and Internet of Things Lab, focused on enhancing natural language modeling and generation in Arabic. The model represents a significant stride in LLM research, specifically addressing the linguistic complexities and nuances of the Arabic language.
How to Use the Pre-Trained Model
You are invited to use this pre-trained, native Arabic language model as an experimental tool to assess its capabilities, aid in its fine-tuning, and evaluate its performance across a variety of downstream tasks. We encourage you to review our technical report for a comprehensive understanding of the model's performance metrics and the specific downstream tasks on which it has been tested; this will provide valuable insight into its applicability and effectiveness in diverse applications.
Role in ArabianLLM Initiatives
ArabianGPT-0.3B is crucial for advancing Arabic language processing, addressing challenges unique to Arabic morphology and dialects.
Limitations and Ethical Considerations
- The model may show limitations in context understanding or text generation in certain scenarios.
- We emphasize ethical use to prevent the propagation of misinformation or harmful content.
Acknowledgments
Special thanks to Prince Sultan University, particularly the Robotics and Internet of Things Lab.
Contact Information
For inquiries: riotu@psu.edu.sa.
🔧 Technical Details
Training
- Dataset: scraped texts, including scientific articles and general texts
- Data Size: 23 GB
- Tokenizer: Aranizer 64K
- Tokens: Over 3.3 billion
- Hardware: 4 NVIDIA A100 GPUs
- Training Duration: 45 days
- Performance: final training loss of 3.82 (a comparable loss can be computed as sketched below)
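As a hands-on counterpart to these figures, the sketch below checks the tokenizer's vocabulary size and computes cross-entropy loss and perplexity on one Arabic sentence. The sample sentence is an arbitrary assumption, and a single-sentence loss is not directly comparable to the reported 3.82, which was measured during training.

import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "riotu-lab/ArabianGPT-03B"
tokenizer = AutoTokenizer.from_pretrained(model_id)  # the Aranizer 64K tokenizer
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

print(len(tokenizer))  # vocabulary size; expected to be around 64K

# Arabic proverb: "Knowledge is light and ignorance is darkness" (arbitrary sample)
sample = "العلم نور والجهل ظلام"
inputs = tokenizer(sample, return_tensors="pt")

# Passing labels equal to input_ids makes the model return the mean
# next-token cross-entropy loss over the sequence.
with torch.no_grad():
    loss = model(**inputs, labels=inputs["input_ids"]).loss

print(f"loss: {loss.item():.2f}  perplexity: {math.exp(loss.item()):.1f}")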
📄 License
The model is licensed under the Apache-2.0 license.
⚠️ Disclaimer
We disclaim all responsibility for any harm, inaccuracies, or inappropriate content generated by ArabianGPT-0.3B; users engage with and apply the model's outputs at their own risk.
⚠️ Important Note
Currently, we offer a raw pre-trained model. Our team is actively working on releasing instruction-based LLMs that are fine-tuned and augmented with RLHF. The first set of pre-trained models has been made available for community exploration. While we do have models fine-tuned for specific tasks such as summarization and sentiment analysis, they are still in the development phase.