XLNet (base-sized model)
XLNet is a pre-trained English language model. It addresses the challenges of language understanding and representation, offering state-of-the-art performance on various downstream language tasks.
Quick Start
XLNet is a pre-trained English language model. It was introduced in the paper XLNet: Generalized Autoregressive Pretraining for Language Understanding by Yang et al. and first released in this repository.
Disclaimer: The team releasing XLNet did not write a model card for this model, so this model card has been written by the Hugging Face team.
Features
- Novel Training Objective: XLNet is based on a novel generalized permutation language modeling objective for unsupervised language representation learning.
- Powerful Backbone: It employs Transformer-XL as the backbone model, which performs well on language tasks with long-context requirements.
- SOTA Performance: Achieves state-of-the-art results on various downstream language tasks such as question answering, natural language inference, sentiment analysis, and document ranking.
Installation
No specific installation steps are provided in the original document.
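Assuming the usual workflow for models hosted on the Hugging Face Hub (an assumption, since the original card gives no instructions), installing the transformers library, plus PyTorch for the PyTorch examples below, is sufficient:

pip install transformers torch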
Usage Examples
Basic Usage
from transformers import XLNetTokenizer, XLNetModel

# Load the pre-trained tokenizer and model weights
tokenizer = XLNetTokenizer.from_pretrained('xlnet-base-cased')
model = XLNetModel.from_pretrained('xlnet-base-cased')

# Tokenize a sentence and return PyTorch tensors
inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
outputs = model(**inputs)

# Final-layer hidden states, one vector per input token
last_hidden_states = outputs.last_hidden_state
Advanced Usage
No advanced usage examples are provided in the original document.
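As an illustration (not part of the official card), the sketch below uses XLNetLMHeadModel with the perm_mask and target_mapping arguments of the transformers XLNet implementation to predict one position from bidirectional context; the sentence and masking choice are purely illustrative.

from transformers import XLNetTokenizer, XLNetLMHeadModel
import torch

tokenizer = XLNetTokenizer.from_pretrained('xlnet-base-cased')
model = XLNetLMHeadModel.from_pretrained('xlnet-base-cased')

# Encode a sentence whose last token we want the model to predict
input_ids = torch.tensor(tokenizer.encode("Hello, my dog is very cute", add_special_tokens=False)).unsqueeze(0)

# perm_mask[b, j, k] = 1.0 means token j may not attend to token k;
# here no token may see the last position, so it must be predicted
perm_mask = torch.zeros((1, input_ids.shape[1], input_ids.shape[1]), dtype=torch.float)
perm_mask[:, :, -1] = 1.0

# target_mapping selects which position(s) to produce logits for (the last token)
target_mapping = torch.zeros((1, 1, input_ids.shape[1]), dtype=torch.float)
target_mapping[0, 0, -1] = 1.0

outputs = model(input_ids, perm_mask=perm_mask, target_mapping=target_mapping)
next_token_logits = outputs.logits  # shape: (1, 1, vocab_size)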
Documentation
Model description
XLNet is a new unsupervised language representation learning method based on a novel generalized permutation language modeling objective. Additionally, XLNet employs Transformer-XL as the backbone model, exhibiting excellent performance for language tasks involving long context. Overall, XLNet achieves state-of-the-art (SOTA) results on various downstream language tasks including question answering, natural language inference, sentiment analysis, and document ranking.
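For reference (not part of the original card), the permutation language modeling objective introduced in the paper maximizes the expected autoregressive log-likelihood over factorization orders z sampled from the set Z_T of all permutations of a length-T sequence:

$$\max_{\theta}\ \mathbb{E}_{\mathbf{z}\sim\mathcal{Z}_T}\left[\sum_{t=1}^{T}\log p_{\theta}\big(x_{z_t}\mid \mathbf{x}_{\mathbf{z}_{<t}}\big)\right]$$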
Intended uses & limitations
The model is mostly intended to be fine-tuned on a downstream task. See the model hub to look for fine-tuned versions on a task that interests you.
Note that this model is primarily aimed at being fine-tuned on tasks that use the whole sentence (potentially masked) to make decisions, such as sequence classification, token classification, or question answering. For tasks such as text generation, you should look at models like GPT2.
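As a minimal sketch of the fine-tuning route (an illustration, not part of the original card), the snippet below loads the checkpoint behind a sequence classification head; the two-label setup and the example sentence/label are hypothetical, and the actual dataset and training loop are left out.

from transformers import XLNetTokenizer, XLNetForSequenceClassification
import torch

tokenizer = XLNetTokenizer.from_pretrained('xlnet-base-cased')
# num_labels=2 is an assumption for a hypothetical binary classification task
model = XLNetForSequenceClassification.from_pretrained('xlnet-base-cased', num_labels=2)

inputs = tokenizer("XLNet handles long documents well.", return_tensors="pt")
labels = torch.tensor([1])  # hypothetical label for this example sentence

outputs = model(**inputs, labels=labels)
loss, logits = outputs.loss, outputs.logits  # loss would be backpropagated during fine-tuning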
Technical Details
No specific technical details beyond the model description are provided in the original document, so this section is skipped.
License
This model is released under the MIT license.
BibTeX entry and citation info
@article{DBLP:journals/corr/abs-1906-08237,
  author     = {Zhilin Yang and
                Zihang Dai and
                Yiming Yang and
                Jaime G. Carbonell and
                Ruslan Salakhutdinov and
                Quoc V. Le},
  title      = {XLNet: Generalized Autoregressive Pretraining for Language Understanding},
  journal    = {CoRR},
  volume     = {abs/1906.08237},
  year       = {2019},
  url        = {http://arxiv.org/abs/1906.08237},
  eprinttype = {arXiv},
  eprint     = {1906.08237},
  timestamp  = {Mon, 24 Jun 2019 17:28:45 +0200},
  biburl     = {https://dblp.org/rec/journals/corr/abs-1906-08237.bib},
  bibsource  = {dblp computer science bibliography, https://dblp.org}
}
Property | Details
--- | ---
Model Type | XLNet (base-sized model)
Training Data | BookCorpus, Wikipedia