
bert-base-arabertv02

Developed by aubmindlab
AraBERT is an Arabic pre-trained language model based on the BERT architecture, specifically optimized for Arabic language understanding tasks.
Downloads: 666.17k
Released: 3/2/2022

Model Overview

AraBERT is a pre-trained language model for Arabic based on the BERT architecture. It performs well on a range of Arabic NLP tasks, including sentiment analysis, named entity recognition, and question answering.

Model Features

Arabic optimization
Specifically optimized for the characteristics of Arabic, including the tokenization of Arabic's attached prefixes and suffixes (clitics).
Pre-segmentation processing
Uses the Farasa segmenter to pre-segment text before tokenization, improving the model's handling of Arabic morphology.
Large-scale training data
Trained on over 200M Arabic sentences (8.6B words).
Multi-version support
Available in base and large versions, plus a variant trained on Twitter data.
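The clitic-splitting idea behind pre-segmentation can be illustrated with a toy rule-based splitter. This is not Farasa itself (Farasa is a full morphological segmenter); the prefix list and the minimum-stem-length rule below are illustrative assumptions, but the "+"-marked output mirrors the style Farasa-segmented AraBERT variants consume:

```python
# Toy illustration of clitic pre-segmentation (NOT the real Farasa
# segmenter). It splits a few common Arabic proclitics -- e.g.
# wa- "and", al- "the", bi- "with" -- and marks the split with "+",
# similar to the pre-segmented input used by some AraBERT variants.
PREFIXES = ["وال", "بال", "كال", "فال", "ال", "و", "ب", "ك", "ف", "ل"]

def toy_segment(word: str) -> str:
    """Split one known proclitic off the front of a word, if any."""
    for p in PREFIXES:
        # Require a remaining stem of at least 2 letters, so short
        # words that merely start with these letters stay intact.
        if word.startswith(p) and len(word) - len(p) >= 2:
            return p + "+ " + word[len(p):]
    return word

def toy_presegment(text: str) -> str:
    """Apply the toy splitter to every whitespace-separated token."""
    return " ".join(toy_segment(w) for w in text.split())

print(toy_presegment("والكتاب"))  # splits off "وال" -> "وال+ كتاب"
```

A real pipeline would run Farasa (or the AraBERT preprocessing utilities) instead of this handful of rules, but the effect is the same: clitics become separate units before WordPiece tokenization.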

Model Capabilities

Arabic text understanding
Sentiment analysis
Named entity recognition
Question-answering system
Masked text filling (fill-mask)
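As a BERT-style masked language model, it can be loaded directly through Hugging Face transformers. A minimal fill-mask sketch (the Arabic example sentence is illustrative; `[MASK]` is the standard BERT mask token):

```python
from transformers import pipeline

# Load the model from the Hugging Face Hub (downloads weights on first use).
fill = pipeline("fill-mask", model="aubmindlab/bert-base-arabertv02")

# Ask the model to fill the masked token in an Arabic sentence
# ("The capital of Lebanon is [MASK].") -- illustrative example.
preds = fill("عاصمة لبنان هي [MASK] .")

# Each prediction carries the candidate token and its score.
for p in preds[:3]:
    print(p["token_str"], round(p["score"], 3))
```

The same checkpoint can be fine-tuned for the downstream tasks listed above (sentiment analysis, NER, QA) using the standard transformers training APIs.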

Use Cases

Sentiment analysis
Sentiment analysis of Arabic comments
Determine the sentiment polarity of Arabic social-media comments or product reviews.
Performs better than mBERT on multiple Arabic sentiment analysis datasets.
Named entity recognition
Entity recognition in Arabic text
Identify entities such as person names and place names in Arabic text.
Achieves strong results on the ANERcorp dataset.
Question-answering system
Arabic reading comprehension
Answer questions based on Arabic articles.
Performs well on the Arabic-SQuAD and ARCD datasets.