DarijaBERT Arabizi

Developed by SI2M-Lab
The first BERT model built specifically for Darija, the Moroccan Arabic dialect. This variant processes Darija text written in Latin script (Arabizi).
Downloads 300
Release date: 3/2/2022

Model Overview

DarijaBERT is an open-source model jointly released by AIOX Labs and the SI2M Lab at INSEA for understanding and processing text in Darija, the Moroccan Arabic dialect. It uses the BERT-base architecture with the next sentence prediction (NSP) objective removed, and it is aimed at informal social media text.

Model Features

Dialect-specific
The first BERT model built specifically for Moroccan Darija
Arabizi support
Handles Darija text written in Latin script
Trained on social media data
Training data comes from YouTube comments, which makes the model suitable for informal text

Model Capabilities

Moroccan dialect text understanding
Dialect text feature extraction
Social media text processing
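
As an illustration of the feature extraction capability, the sketch below turns Darija comments into fixed-size vectors by mean-pooling the model's last hidden states, then compares two vectors with cosine similarity. It assumes the checkpoint is published on the Hugging Face Hub as `SI2M-Lab/DarijaBERT-arabizi` (this page does not state the identifier) and that the `transformers` and `torch` packages are installed. The function names and the choice of mean pooling are illustrative, not part of the model's documentation.

```python
import math

# Assumed Hub identifier; verify it against the official model page before use.
MODEL_NAME = "SI2M-Lab/DarijaBERT-arabizi"


def embed_comments(texts, model_name=MODEL_NAME):
    """Return one mean-pooled embedding (list of floats) per input text."""
    # Heavy dependencies are imported lazily so the helper below stays usable
    # even when torch/transformers are not installed.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()

    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state  # (batch, tokens, hidden)

    # Average only over real tokens; padding positions have mask value 0.
    mask = batch["attention_mask"].unsqueeze(-1).float()
    summed = (hidden * mask).sum(dim=1)
    counts = mask.sum(dim=1).clamp(min=1e-9)
    return (summed / counts).tolist()


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Calling `embed_comments(["salam, kidayr?", "labas 3lik?"])` (illustrative Arabizi inputs) downloads the weights on first use and returns two vectors that can be passed to `cosine_similarity`. Mean pooling is one common way to get a sentence-level vector from a BERT encoder; using the `[CLS]` token instead is an equally reasonable choice.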

Use Cases

Natural Language Processing
Dialect social media analysis
Analyzing Moroccan dialect social media comments, including informal dialect expressions
Dialect text classification
Classification tasks for Darija texts
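
For the text classification use case, a minimal sketch of attaching a classification head to the encoder is shown here. It assumes the Hub identifier `SI2M-Lab/DarijaBERT-arabizi`, an installed `transformers` package, and a hypothetical label set chosen by the user. The head created this way is randomly initialized, so the model must be fine-tuned on labeled Darija data before its predictions mean anything.

```python
# Assumed Hub identifier; verify it against the official model page before use.
MODEL_NAME = "SI2M-Lab/DarijaBERT-arabizi"


def build_label_maps(labels):
    """Build the label2id / id2label dictionaries expected by the model config."""
    label2id = {label: idx for idx, label in enumerate(labels)}
    id2label = {idx: label for label, idx in label2id.items()}
    return label2id, id2label


def load_classifier(labels, model_name=MODEL_NAME):
    """Load the tokenizer and the encoder with a fresh, untrained classification head."""
    # Imported lazily so build_label_maps works without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    label2id, id2label = build_label_maps(labels)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name,
        num_labels=len(labels),
        id2label=id2label,
        label2id=label2id,
    )
    return tokenizer, model
```

For example, `load_classifier(["negative", "neutral", "positive"])` (a hypothetical sentiment label set) returns a tokenizer and a model ready for fine-tuning with a standard training loop.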