# XLM-ROBERTA-BASE-XNLI-ZH
This model is designed for zero-shot text classification in hate speech detection, leveraging the XLM-Roberta-base architecture and fine-tuned on Chinese data.
## Quick Start

### Usage with the zero-shot classification pipeline
```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="morit/chinese_xlm_xnli")
```
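Once the pipeline is loaded, classification works by passing a text and a set of candidate labels. The input text and labels below are illustrative, not part of the model card:

```python
from transformers import pipeline

# Load the zero-shot classifier (downloads the model on first use)
classifier = pipeline("zero-shot-classification",
                      model="morit/chinese_xlm_xnli")

# Illustrative Chinese input and candidate labels (not from the model card)
sequence = "我讨厌这个群体的人"
candidate_labels = ["仇恨言论", "非仇恨言论"]  # "hate speech", "not hate speech"

result = classifier(sequence, candidate_labels)
# result is a dict with keys "sequence", "labels", "scores";
# labels are sorted by descending score, and the scores sum to 1.
print(result["labels"][0], result["scores"][0])
```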
## Features
- This model is based on the XLM-Roberta-base model, which has been continuously pre-trained on a large corpus of multilingual Twitter data.
- It is developed following a strategy similar to that in the Tweet Eval framework.
- The model is further fine-tuned on the Chinese part of the XNLI training dataset. Since the base model is pre-trained on 100 languages, it also shows some effectiveness in other languages.
## Documentation

### Intended Usage
This model is developed for zero-shot text classification in the field of hate speech detection. It focuses on Chinese, the language it is fine-tuned on, but since the base model is pre-trained on 100 different languages, it has shown some effectiveness in other languages as well. Please refer to the list of languages in the XLM-RoBERTa paper.
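Under the hood, zero-shot classification frames each candidate label as an NLI hypothesis (via a template such as "This example is {}.") and scores how strongly the input text entails it. A minimal sketch of that mechanic, using made-up entailment logits rather than real model outputs:

```python
import math

def zero_shot_scores(text, labels, entailment_logits,
                     template="This example is {}."):
    """Softmax over per-label entailment logits, as the pipeline does
    when multi_label=False. `entailment_logits` are illustrative here;
    the real pipeline obtains one from the NLI model for each
    (premise=text, hypothesis=template.format(label)) pair."""
    hypotheses = [template.format(label) for label in labels]
    exps = [math.exp(z) for z in entailment_logits]
    total = sum(exps)
    scores = [e / total for e in exps]
    # Sort labels by descending score, mirroring the pipeline's output.
    ranked = sorted(zip(labels, scores), key=lambda pair: -pair[1])
    return hypotheses, ranked

hyps, ranked = zero_shot_scores(
    "some text",
    ["hate speech", "not hate speech"],
    entailment_logits=[2.0, 0.5],  # made-up numbers for illustration
)
print(hyps)    # hypotheses fed to the NLI model
print(ranked)  # top label first; scores sum to 1
```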
### Training
This model is pre-trained on a set of 100 languages and then further trained on 198M multilingual tweets as described in the original paper. Additionally, it is trained on the Chinese training set of the XNLI dataset, which is a machine-translated version of the MNLI dataset. It is trained for 5 epochs on the XNLI train set, and evaluated on the XNLI eval dataset at the end of each epoch to select the best-performing model.

- Learning rate: 2e-5
- Batch size: 32
- Max sequence length: 128
Training was conducted on a single NVIDIA GeForce RTX 3090 GPU and took 1 hour and 47 minutes.
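The train-and-select loop described above (evaluate after each epoch, keep the best checkpoint) can be sketched in plain Python; the per-epoch accuracies below are placeholders, not the model's actual numbers:

```python
# Hyperparameters stated in the model card
config = {"learning_rate": 2e-5, "batch_size": 32,
          "max_seq_length": 128, "num_epochs": 5}

def select_best_epoch(eval_accuracies):
    """Return the 1-based epoch with the highest XNLI eval accuracy,
    mimicking the end-of-epoch model-selection step."""
    best = max(range(len(eval_accuracies)), key=lambda i: eval_accuracies[i])
    return best + 1, eval_accuracies[best]

# Placeholder eval accuracies for the 5 epochs (illustrative only)
accuracies = [0.712, 0.748, 0.761, 0.758, 0.755]
epoch, acc = select_best_epoch(accuracies)
print(f"best checkpoint: epoch {epoch} (eval accuracy {acc:.3f})")
```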
### Evaluation
The best-performing model is evaluated on the XNLI test set to obtain comparable results.

- Accuracy: 76.17 %
## License
This model is released under the MIT license.
| Property | Details |
|---|---|
| Model Type | XLM-ROBERTA-BASE-XNLI-ZH |
| Training Data | Pre-trained on 100 languages, further trained on 198M multilingual tweets, and fine-tuned on the Chinese XNLI dataset |
| Metrics | Accuracy |
| Pipeline Tag | Zero-Shot Classification |
| Datasets | XNLI |
| Language | Chinese |