
BERT Large AraBERTv2 (bert-large-arabertv2)

Developed by aubmindlab
AraBERT is a pre-trained language model based on Google's BERT architecture, specifically designed for Arabic natural language understanding tasks.
Downloads: 334
Released: 3/2/2022

Model Overview

AraBERT is a BERT model optimized for Arabic, improving performance in Arabic NLP tasks through enhanced preprocessing and training on larger datasets.

Model Features

Improved Preprocessing
Inserts spaces to separate punctuation marks and digits that were attached to words, so the tokenizer segments them cleanly.
Larger Training Data
Trained on approximately 3.5 times more data than the previous version, including Arabic Wikipedia and the OSCAR corpus.
Multi-version Support
Available in base and large sizes, with additional variants pre-trained on Twitter data, to suit different needs.
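
The improved preprocessing described above can be illustrated with a small sketch. This is not the official AraBERT preprocessor (the `arabert` package ships its own `ArabertPreprocessor`); it is a simplified, hypothetical function showing the core idea of inserting spaces around punctuation and digit runs that are glued to words:

```python
import re

def simple_arabic_presplit(text: str) -> str:
    """Simplified sketch of AraBERT-style whitespace insertion.

    NOT the official ArabertPreprocessor -- just an illustration of
    separating punctuation and digits that are attached to words so
    the tokenizer sees them as distinct tokens.
    """
    # Put spaces around common Arabic and Latin punctuation marks.
    text = re.sub(r"([?!,;:.،؛؟])", r" \1 ", text)
    # Separate digit runs that are attached to letters (e.g. "عام2020").
    text = re.sub(r"(\d+)", r" \1 ", text)
    # Collapse the extra whitespace introduced above.
    return re.sub(r"\s+", " ", text).strip()

# "عام2020،" becomes "عام 2020 ،" -- the year and the comma are now
# separate tokens instead of being fused to the word.
print(simple_arabic_presplit("صدر التقرير عام2020، وكان مهماً."))
```

The real preprocessor additionally handles elongation characters, diacritics, and URL/user-mention normalization; the space-insertion step shown here is the part the feature list refers to.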

Model Capabilities

Arabic Text Understanding
Sentiment Analysis
Named Entity Recognition
Question Answering

Use Cases

Sentiment Analysis
Social Media Sentiment Analysis
Analyze sentiment tendencies in Arabic social media content.
Performs strongly on benchmarks such as HARD and ASTD-Balanced.
Named Entity Recognition
News Entity Recognition
Identify named entities from Arabic news.
Evaluated on the ANERcorp dataset.
Question Answering
Arabic Question Answering
Answer questions based on Arabic text.
Evaluated on Arabic-SQuAD and ARCD datasets.
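
As a starting point for any of the tasks above, the checkpoint can be loaded with the Hugging Face transformers library. A minimal sketch, assuming the `aubmindlab/bert-large-arabertv2` checkpoint is reachable (it is downloaded on first run); note the pretrained model only does masked-word prediction out of the box, and sentiment analysis, NER, or QA require fine-tuning this backbone on task-specific data:

```python
from transformers import pipeline

# Load the pretrained masked-language-model head. Downstream tasks
# (sentiment analysis, NER, QA) need fine-tuning on labeled data first.
fill = pipeline("fill-mask", model="aubmindlab/bert-large-arabertv2")

# "[MASK]" is BERT's mask token; the model proposes Arabic completions
# ranked by probability.
for pred in fill("عاصمة لبنان هي [MASK] .")[:3]:
    print(pred["token_str"], round(pred["score"], 3))
```

For best results, input text should first be run through the `arabert` package's `ArabertPreprocessor`, which applies the same cleaning used during pre-training.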