# XLM-ROBERTA-BASE-XNLI-ZH
This model is designed for zero-shot text classification in hate speech detection, leveraging the XLM-Roberta-base architecture and fine-tuned on Chinese data.
## Quick Start

### Usage with the zero-shot classification pipeline
```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="morit/chinese_xlm_xnli")
```
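Once the pipeline is loaded, classification works by passing a text and a set of candidate labels. The input text and labels below are illustrative, not part of the model card:

```python
from transformers import pipeline

# Load the zero-shot classifier (downloads the model on first use)
classifier = pipeline("zero-shot-classification",
                      model="morit/chinese_xlm_xnli")

# Illustrative Chinese input and candidate labels (not from the model card)
sequence = "我讨厌这个群体的人"
candidate_labels = ["仇恨言论", "非仇恨言论"]  # "hate speech", "not hate speech"

result = classifier(sequence, candidate_labels)
# result is a dict with keys "sequence", "labels", "scores";
# labels are sorted by descending score, and the scores sum to 1.
print(result["labels"][0], result["scores"][0])
```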
## Features
- This model is based on the XLM-Roberta-base model, which has been continuously pre-trained on a large corpus of multilingual Twitter data.
- It is developed following a strategy similar to that in the Tweet Eval framework.
- The model is further fine-tuned on the Chinese part of the XNLI training dataset. Since the base model is pre-trained on 100 languages, it also shows some effectiveness in other languages.
## Documentation

### Intended Usage
This model is developed for zero-shot text classification in the field of hate speech detection. It focuses on Chinese, the language it is fine-tuned on, but since the base model is pre-trained on 100 different languages, it has shown some effectiveness in other languages as well. Please refer to the list of languages in the XLM-RoBERTa paper.
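Under the hood, zero-shot classification frames each candidate label as an NLI hypothesis (via a template such as "This example is {}.") and scores how strongly the input text entails it. A minimal sketch of that mechanic, using made-up entailment logits rather than real model outputs:

```python
import math

def zero_shot_scores(text, labels, entailment_logits,
                     template="This example is {}."):
    """Softmax over per-label entailment logits, as the pipeline does
    when multi_label=False. `entailment_logits` are illustrative here;
    the real pipeline obtains one from the NLI model for each
    (premise=text, hypothesis=template.format(label)) pair."""
    hypotheses = [template.format(label) for label in labels]
    exps = [math.exp(z) for z in entailment_logits]
    total = sum(exps)
    scores = [e / total for e in exps]
    # Sort labels by descending score, mirroring the pipeline's output.
    ranked = sorted(zip(labels, scores), key=lambda pair: -pair[1])
    return hypotheses, ranked

hyps, ranked = zero_shot_scores(
    "some text",
    ["hate speech", "not hate speech"],
    entailment_logits=[2.0, 0.5],  # made-up numbers for illustration
)
print(hyps)    # hypotheses fed to the NLI model
print(ranked)  # top label first; scores sum to 1
```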
### Training
This model is pre-trained on a set of 100 languages and then further trained on 198M multilingual tweets as described in the original paper. Additionally, it is trained on the Chinese training set of the XNLI dataset, which is a machine-translated version of the MNLI dataset. It is trained for 5 epochs on the XNLI train set, and evaluated on the XNLI eval dataset at the end of each epoch to select the best-performing model.

- Learning rate: 2e-5
- Batch size: 32
- Max sequence length: 128
Training was conducted on a single NVIDIA GeForce RTX 3090 GPU and took 1 hour and 47 minutes.
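The train-and-select loop described above (evaluate after each epoch, keep the best checkpoint) can be sketched in plain Python; the per-epoch accuracies below are placeholders, not the model's actual numbers:

```python
# Hyperparameters stated in the model card
config = {"learning_rate": 2e-5, "batch_size": 32,
          "max_seq_length": 128, "num_epochs": 5}

def select_best_epoch(eval_accuracies):
    """Return the 1-based epoch with the highest XNLI eval accuracy,
    mimicking the end-of-epoch model-selection step."""
    best = max(range(len(eval_accuracies)), key=lambda i: eval_accuracies[i])
    return best + 1, eval_accuracies[best]

# Placeholder eval accuracies for the 5 epochs (illustrative only)
accuracies = [0.712, 0.748, 0.761, 0.758, 0.755]
epoch, acc = select_best_epoch(accuracies)
print(f"best checkpoint: epoch {epoch} (eval accuracy {acc:.3f})")
```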
### Evaluation
The best-performing model is evaluated on the XNLI test set to obtain comparable results.

- Accuracy: 76.17 %
## License
This model is released under the MIT license.
| Property | Details |
|---|---|
| Model Type | XLM-ROBERTA-BASE-XNLI-ZH |
| Training Data | Pre-trained on 100 languages, further trained on 198M multilingual tweets, and fine-tuned on the Chinese XNLI dataset |
| Metrics | Accuracy |
| Pipeline Tag | Zero-Shot Classification |
| Datasets | XNLI |
| Language | Chinese |