📋 ArabianGPT Model Overview
ArabianGPT-0.3B is a specialized GPT-2 model optimized for Arabic language modeling, developed to address the unique linguistic challenges of Arabic.
🚀 Quick Start
You can use this pre-trained, native Arabic language model as an experimental tool. Here is an example using the Transformers pipeline:
from transformers import pipeline

# Load ArabianGPT-0.3B through the text-generation pipeline
pipe = pipeline("text-generation", model="riotu-lab/ArabianGPT-03B", max_new_tokens=512)

text = ''  # your Arabic prompt here
print(pipe(text)[0]["generated_text"])
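The pipeline returns a list with one dictionary per prompt; the generated text (prompt plus continuation) is stored under the `generated_text` key, which is what the `print` call above reads.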
✨ Features
- Architecture: GPT-2
- Model Size: 345 million parameters
- Layers: 24
- Attention Heads: 16
- Context Window Size: 1024 tokens (see the configuration check below)
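These hyperparameters match the standard GPT-2 medium configuration, and they can be checked programmatically. A minimal sketch, assuming the checkpoint exposes the usual GPT2Config fields (n_layer, n_head, n_positions):

from transformers import AutoConfig

# Fetch the configuration shipped with the checkpoint
config = AutoConfig.from_pretrained("riotu-lab/ArabianGPT-03B")

print(config.n_layer)      # layers, expected: 24
print(config.n_head)       # attention heads, expected: 16
print(config.n_positions)  # context window, expected: 1024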
📦 Installation
To run the examples in this card, install the Hugging Face Transformers library together with a backend such as PyTorch: `pip install transformers torch`.
💻 Usage Examples
Basic Usage
from transformers import pipeline

# Same pipeline as in the Quick Start, repeated here for completeness
pipe = pipeline("text-generation", model="riotu-lab/ArabianGPT-03B", max_new_tokens=512)

text = ''  # your Arabic prompt here
result = pipe(text)
print(result[0]["generated_text"])
Advanced Usage
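For finer control than the pipeline offers, the tokenizer and model can be loaded directly and explicit sampling parameters passed to generate. The sketch below illustrates this; the prompt and the sampling values (temperature, top_p, repetition_penalty) are illustrative assumptions, not settings recommended by the model authors.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "riotu-lab/ArabianGPT-03B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

prompt = "العلم نور"  # Arabic proverb: "Knowledge is light"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=True,          # sample instead of greedy decoding
        temperature=0.7,         # illustrative value only
        top_p=0.9,               # nucleus sampling cutoff
        repetition_penalty=1.2,  # discourages repetition loops
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

Sampling rather than greedy decoding generally reduces the repetition loops that raw GPT-2-class models are prone to.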
📚 Documentation
Introduction
ArabianGPT-0.3B, developed under the ArabianLLM initiatives, is a specialized GPT-2 model optimized for Arabic language modeling. It is a product of the collaborative efforts at Prince Sultan University's Robotics and Internet of Things Lab, focused on enhancing natural language modeling and generation in Arabic. The model represents a significant stride in LLM research, specifically addressing the linguistic complexities and nuances of the Arabic language.
How to Use the Pre-Trained Model
You are invited to use this pre-trained, native Arabic language model as an experimental tool to assess its capabilities, aid in its fine-tuning, and evaluate its performance across a variety of downstream tasks. We encourage you to review our technical report for a comprehensive understanding of the model's performance metrics and the specific downstream tasks on which it has been tested; this will provide valuable insight into its applicability and effectiveness in diverse applications.
Role in ArabianLLM Initiatives
ArabianGPT-0.3B is crucial for advancing Arabic language processing, addressing challenges unique to Arabic morphology and dialects.
Limitations and Ethical Considerations
- The model may show limitations in context understanding or text generation in certain scenarios.
- We emphasize ethical use to prevent the propagation of misinformation or harmful content.
Acknowledgments
Special thanks to Prince Sultan University, particularly the Robotics and Internet of Things Lab.
Contact Information
For inquiries: riotu@psu.edu.sa.
🔧 Technical Details
Training
- Dataset: scraped texts, including scientific articles and general texts
- Data Size: 23 GB
- Tokenizer: Aranizer 64K
- Tokens: Over 3.3 billion
- Hardware: 4 NVIDIA A100 GPUs
- Training Duration: 45 days
- Performance: final training loss of 3.82 (a comparable loss can be computed as sketched below)
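As a hands-on counterpart to these figures, the sketch below checks the tokenizer's vocabulary size and computes cross-entropy loss and perplexity on one Arabic sentence. The sample sentence is an arbitrary assumption, and a single-sentence loss is not directly comparable to the reported 3.82, which was measured during training.

import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "riotu-lab/ArabianGPT-03B"
tokenizer = AutoTokenizer.from_pretrained(model_id)  # the Aranizer 64K tokenizer
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

print(len(tokenizer))  # vocabulary size; expected to be around 64K

# Arabic proverb: "Knowledge is light and ignorance is darkness" (arbitrary sample)
sample = "العلم نور والجهل ظلام"
inputs = tokenizer(sample, return_tensors="pt")

# Passing labels equal to input_ids makes the model return the mean
# next-token cross-entropy loss over the sequence.
with torch.no_grad():
    loss = model(**inputs, labels=inputs["input_ids"]).loss

print(f"loss: {loss.item():.2f}  perplexity: {math.exp(loss.item()):.1f}")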
📄 License
The model is licensed under the Apache-2.0 license.
⚠️ Disclaimer
We disclaim all responsibility for any harm, inaccuracies, or inappropriate content generated by ArabianGPT-0.3B; users engage with and apply the model's outputs at their own risk.
⚠️ Important Note
Currently, we offer a raw pre-trained model. Our team is actively working on releasing instruction-based LLMs that are fine-tuned and augmented with RLHF. The first set of pre-trained models has been made available for community exploration. While we do have models fine-tuned for specific tasks such as summarization and sentiment analysis, they are still in the development phase.