# RobBERTje
RobBERTje is a collection of distilled Dutch BERT-based models, offering multiple options with different sizes and training settings for various use cases.
## Quick Start
RobBERTje provides a range of distilled models based on RobBERT. You can choose from multiple models with different sizes and training settings according to your specific use case. We are constantly working on releasing models with better performance; keep an eye on the repository for updates.
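As a minimal sketch, any of these checkpoints can be loaded like a regular RoBERTa-style masked language model with the Hugging Face `transformers` library. The id below is the non-shuffled model listed further down in this card; swap in whichever variant fits your use case:

```python
# Sketch: load a RobBERTje checkpoint with Hugging Face transformers
# (assumes the `transformers` package is installed).
MODEL_ID = "DTAI-KULeuven/robbertje-1-gb-non-shuffled"

def load_robbertje(model_id: str = MODEL_ID):
    # Deferred import so the constant above is usable without transformers.
    from transformers import AutoTokenizer, AutoModelForMaskedLM

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForMaskedLM.from_pretrained(model_id)
    return tokenizer, model
```

The returned pair can then be used with a standard fill-mask workflow, exactly as with the RobBERT teacher model.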
## Documentation
### News
- February 21, 2022: Our paper about RobBERTje has been published in volume 11 of the CLIN Journal!
- July 2, 2021: Publicly released 4 RobBERTje models.
- May 12, 2021: RobBERTje was accepted at CLIN31 for an oral presentation!
### The models
| Model | Description | Parameters | Training size | Huggingface id |
|---|---|---|---|---|
| Non-shuffled | Trained on the non-shuffled variant of the OSCAR corpus, without any operations to preserve this order during training and distillation. | 74 M | 1 GB | [DTAI-KULeuven/robbertje-1-gb-non-shuffled](https://huggingface.co/DTAI-KULeuven/robbertje-1-gb-non-shuffled) |
| Shuffled | Trained on the publicly available and shuffled OSCAR corpus. | 74 M | 1 GB | this model |
| Merged (p = 0.5) | Same as the non-shuffled variant, but sequential sentences of the same document are merged with a probability of 50%. | 74 M | 1 GB | [DTAI-KULeuven/robbertje-1-gb-merged](https://huggingface.co/DTAI-KULeuven/robbertje-1-gb-merged) |
| BORT | A smaller version with 8 attention heads instead of 12 and 4 layers instead of 6 (and 12 for RobBERT). | 46 M | 1 GB | [DTAI-KULeuven/robbertje-1-gb-bort](https://huggingface.co/DTAI-KULeuven/robbertje-1-gb-bort) |
### Results
#### Intrinsic results
We calculated the pseudo-perplexity (PPPL), a built-in metric in our distillation library. This metric gives an indication of how well the model captures the input distribution.
| Model | PPPL |
|---|---|
| RobBERT (teacher) | 7.76 |
| Non-shuffled | 12.95 |
| Shuffled | 18.74 |
| Merged (p = 0.5) | 17.10 |
| BORT | 26.44 |
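For intuition: pseudo-perplexity is typically computed by masking each token of a sentence in turn, scoring the masked token with the model, and exponentiating the negative mean of those log-probabilities. A minimal sketch of that final step (the per-token log-probabilities are assumed to come from the masked language model):

```python
import math

def pseudo_perplexity(token_log_probs):
    """Exponentiated negative mean of per-token masked-LM log-probabilities.

    `token_log_probs` holds log P(token_i | rest of sentence) for each
    position, obtained by masking one position at a time.
    """
    n = len(token_log_probs)
    return math.exp(-sum(token_log_probs) / n)

# If the model assigns probability 0.5 to every masked token, PPPL ≈ 2.
print(pseudo_perplexity([math.log(0.5)] * 4))
```

Lower is better: a PPPL of 7.76 means the teacher is, on average, about as uncertain as choosing uniformly among roughly eight tokens at each masked position.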
#### Extrinsic results
We also evaluated our models on several downstream tasks, just like the teacher model RobBERT. Since that evaluation, a Dutch natural language inference task named SICK-NL has been released, so we evaluated our models on it as well.
| Model | DBRD | DIE-DAT | NER | POS | SICK-NL |
|---|---|---|---|---|---|
| RobBERT (teacher) | 94.4 | 99.2 | 89.1 | 96.4 | 84.2 |
| Non-shuffled | 90.2 | 98.4 | 82.9 | 95.5 | 83.4 |
| Shuffled | 92.5 | 98.2 | 82.7 | 95.6 | 83.4 |
| Merged (p = 0.5) | 92.9 | 96.5 | 81.8 | 95.2 | 82.8 |
| BORT | 89.6 | 92.2 | 79.7 | 94.3 | 81.0 |
## License
This project is licensed under the MIT license.