
Albert Base Arabic

Developed by asafaya
Arabic ALBERT Base is a pretrained language model trained on approximately 4.4 billion words of Arabic data, supporting Modern Standard Arabic and some dialectal content.
Release Time: 3/2/2022

Model Overview

This is a pretrained Arabic language model based on the ALBERT architecture, suitable for natural language processing tasks such as text classification and named entity recognition.
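
As a minimal sketch of how the model can be loaded, assuming the Hugging Face transformers library and the Hub model ID asafaya/albert-base-arabic (an assumption based on the developer name, not stated in this card):

from transformers import AutoTokenizer, AutoModelForMaskedLM

# Load the tokenizer and the pretrained masked language model.
# The model ID "asafaya/albert-base-arabic" is assumed, not confirmed by this card.
tokenizer = AutoTokenizer.from_pretrained("asafaya/albert-base-arabic")
model = AutoModelForMaskedLM.from_pretrained("asafaya/albert-base-arabic")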

Model Features

Multi-Source Data Training
The model is trained on OSCAR Arabic and Wikipedia data, covering Modern Standard Arabic and some dialectal content.
Optimized Training Parameters
The training schedule was adjusted, using 7 million training steps with a batch size of 64 to improve performance.
Retention of Non-Arabic Vocabulary
Non-Arabic words appearing in sentences are kept during preprocessing, which improves performance on tasks such as named entity recognition (NER).

Model Capabilities

Text Classification
Named Entity Recognition
Language Modeling
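
The language modeling capability can be illustrated with a masked-token prediction sketch; the fill-mask pipeline call, the model ID, and the example sentence below are assumptions for illustration, not taken from the original card:

from transformers import pipeline

# Predict the masked token in an Arabic sentence (illustrative example only).
fill_mask = pipeline("fill-mask", model="asafaya/albert-base-arabic")
for prediction in fill_mask("عاصمة مصر هي [MASK]."):
    print(prediction["token_str"], round(prediction["score"], 3))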

Use Cases

Natural Language Processing
Named Entity Recognition
Identify named entities in Arabic text, such as person names and locations.
Text Classification
Classify Arabic text for tasks such as sentiment analysis and topic classification.
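
A minimal sketch of preparing the encoder for NER fine-tuning is shown below; the label set is hypothetical and the classification head starts randomly initialized, so the model must be fine-tuned on a labeled Arabic NER corpus before use:

from transformers import AutoTokenizer, AutoModelForTokenClassification

# Hypothetical entity label set, for illustration only.
labels = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC"]

tokenizer = AutoTokenizer.from_pretrained("asafaya/albert-base-arabic")
model = AutoModelForTokenClassification.from_pretrained(
    "asafaya/albert-base-arabic",
    num_labels=len(labels),
    id2label=dict(enumerate(labels)),
    label2id={label: i for i, label in enumerate(labels)},
)
# Fine-tune on a labeled Arabic NER dataset before running inference.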