Randeng-T5-77M
A Chinese version of mT5-small, designed for Natural Language Transformation (NLT) tasks.
🚀 Quick Start
Prerequisites
Make sure you have installed the transformers and torch libraries.
Installation
pip install transformers torch
Usage Example
from transformers import T5ForConditionalGeneration, AutoTokenizer
import torch
tokenizer = AutoTokenizer.from_pretrained('IDEA-CCNL/Randeng-T5-77M', use_fast=False)
model = T5ForConditionalGeneration.from_pretrained('IDEA-CCNL/Randeng-T5-77M')
⨠Features
- Chinese Adaptation: Specifically tailored for Chinese language processing, making it highly effective in Chinese NLT tasks.
- Efficient Training: Utilizes Corpus-Adaptive Pre-Training (CAPT) on the WuDao Corpora (180 GB version) to accelerate the training process.
📦 Installation
pip install transformers torch
💻 Usage Examples
Basic Usage
from transformers import T5ForConditionalGeneration, AutoTokenizer
import torch
tokenizer = AutoTokenizer.from_pretrained('IDEA-CCNL/Randeng-T5-77M', use_fast=False)
model = T5ForConditionalGeneration.from_pretrained('IDEA-CCNL/Randeng-T5-77M')
# Span-corruption style input: the model fills in the <extra_id_N> sentinel spans
input_text = "北京有悠久的<extra_id_0>和<extra_id_1>。"  # "Beijing has a long <extra_id_0> and <extra_id_1>."
input_ids = tokenizer(input_text, return_tensors='pt').input_ids
outputs = model.generate(input_ids)
output_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(output_text)
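The call above uses the default greedy decoding. If you want more control over the filled spans, you can pass standard generation arguments to model.generate; the snippet below is a minimal sketch on top of the example above, and the beam-search settings (num_beams, max_new_tokens) are illustrative values rather than recommendations from this card.
# Sketch: beam search with an explicit cap on output length (illustrative settings)
outputs = model.generate(
    input_ids,
    num_beams=4,          # keep 4 candidate sequences during the search
    max_new_tokens=32,    # limit the number of generated tokens
    early_stopping=True,  # stop once enough complete candidates are found
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))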
📚 Documentation
Model Taxonomy
| Property | Details |
| --- | --- |
| Demand | General |
| Task | Natural Language Transformation (NLT) |
| Series | Randeng |
| Model | mT5 |
| Parameter | 77M |
| Extra | Chinese |
Model Information
Based on mT5-small, we implement its Chinese version. To accelerate training, we retrain only the vocabulary and embeddings corresponding to Chinese and English in the T5 tokenizer (SentencePiece), and apply Corpus-Adaptive Pre-Training (CAPT) on the WuDao Corpora (180 GB version). The pretraining objective is span corruption. We use the fengshen framework for the pre-training phase, which took about 24 hours on 8 A100 GPUs.
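As an illustration of the span-corruption objective, the model sees an input in which contiguous spans have been replaced by sentinel tokens (<extra_id_0>, <extra_id_1>, ...) and is trained to emit the removed spans after the matching sentinels. The pair below is a minimal sketch with an invented sentence; it is not taken from the WuDao Corpora.
# Span corruption, illustrated with a made-up sentence (not real training data)
corrupted_input = "北京有悠久的<extra_id_0>和<extra_id_1>。"   # "Beijing has a long <...> and <...>."
target = "<extra_id_0>历史<extra_id_1>文化<extra_id_2>"         # removed spans ("history", "culture"), closed by a final sentinel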
📄 License
This project is licensed under the Apache-2.0 license.
📖 Citation
If you are using this resource for your work, please cite our paper:
@article{fengshenbang,
author = {Jiaxing Zhang and Ruyi Gan and Junjie Wang and Yuxiang Zhang and Lin Zhang and Ping Yang and Xinyu Gao and Ziwei Wu and Xiaoqun Dong and Junqing He and Jianheng Zhuo and Qi Yang and Yongfeng Huang and Xiayu Li and Yanghan Wu and Junyu Lu and Xinyu Zhu and Weifeng Chen and Ting Han and Kunhao Pan and Rui Wang and Hao Wang and Xiaojun Wu and Zhongshen Zeng and Chongpei Chen},
title = {Fengshenbang 1.0: Being the Foundation of Chinese Cognitive Intelligence},
journal = {CoRR},
volume = {abs/2209.02970},
year = {2022}
}
You can also cite our website:
@misc{Fengshenbang-LM,
title={Fengshenbang-LM},
author={IDEA-CCNL},
year={2021},
howpublished={\url{https://github.com/IDEA-CCNL/Fengshenbang-LM}},
}