# 🚀 Splinter base model
Splinter-base is a pretrained model for few-shot question answering. It was trained in a self-supervised fashion, which allows it to make use of large amounts of publicly available data.
## 🚀 Quick Start
Splinter-base is the pretrained model presented in the paper [Few-Shot Question Answering by Pretraining Span Selection](https://aclanthology.org/2021.acl-long.239/) (ACL 2021). You can find its original repository here. Note that this model is case-sensitive.
## ⚠️ Important Note
This model does not contain the pretrained weights of the QASS layer (see the paper for details), so the QASS layer is randomly initialized when the model is loaded. For a model that includes those weights, see [tau/splinter-base-qass](https://huggingface.co/tau/splinter-base-qass).
## ✨ Features
- Trained in a self-supervised fashion for few-shot question answering.
- Pretrained with the Recurring Span Selection (RSS) objective.
- Defines the Question-Aware Span Selection (QASS) layer for making multiple, question-conditioned predictions.
## 📚 Documentation
### Model description
Splinter is a model pretrained in a self-supervised way for few-shot question answering. It was pretrained on raw text only, without any human labeling; an automatic process generates inputs and labels from the text, allowing the model to use large amounts of publicly available data.
More precisely, it was pretrained with the Recurring Span Selection (RSS) objective, which mimics the span selection process in extractive question answering. Given a text, clusters of recurring spans (n-grams that appear more than once in the text) are first identified. For each such cluster, all of its instances but one are replaced with a special `[QUESTION]` token, and the model must select the correct (i.e., unmasked) span for each masked one. The model also defines the Question-Aware Span Selection (QASS) layer, which selects spans conditioned on a specific question, enabling multiple predictions.
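To make the RSS input format concrete, here is a deliberately simplified sketch of the masking step (the function name and the whitespace tokenization are illustrative assumptions; the actual pipeline operates on wordpiece tokens and applies additional filtering described in the paper):

```python
from collections import defaultdict

def mask_recurring_spans(text, n=2, question_token="[QUESTION]"):
    """Replace all but one occurrence of each recurring n-gram with a
    [QUESTION] token, mimicking the RSS pretraining input format.
    Simplified sketch: whitespace tokens, no overlap handling."""
    tokens = text.split()
    # Record the start positions of every n-gram in the text.
    positions = defaultdict(list)
    for i in range(len(tokens) - n + 1):
        positions[tuple(tokens[i:i + n])].append(i)
    masked = list(tokens)
    for span, starts in positions.items():
        if len(starts) < 2:
            continue  # not a recurring span
        # Keep the first occurrence unmasked; mask the rest.
        for start in starts[1:]:
            masked[start] = question_token
            for j in range(start + 1, start + n):
                masked[j] = ""
    return " ".join(t for t in masked if t)

print(mask_recurring_spans("new york is big and new york is fun", n=3))
# → new york is big and [QUESTION] fun
```

During pretraining, the model is then trained to point each `[QUESTION]` token back at the retained (unmasked) occurrence of its span.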
### Intended uses & limitations
The main use of this model is few-shot extractive question answering.
### Pretraining
The model was pretrained on a v3-8 TPU for 2.4M steps. The training data is based on Wikipedia and BookCorpus; see the paper for more details.
| Property | Details |
|----------|---------|
| Model Type | Pretrained model for few-shot question answering |
| Training Data | Based on Wikipedia and BookCorpus |
### BibTeX entry and citation info

```bibtex
@inproceedings{ram-etal-2021-shot,
    title = "Few-Shot Question Answering by Pretraining Span Selection",
    author = "Ram, Ori and
      Kirstain, Yuval and
      Berant, Jonathan and
      Globerson, Amir and
      Levy, Omer",
    booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.acl-long.239",
    doi = "10.18653/v1/2021.acl-long.239",
    pages = "3066--3079",
}
```
## 📄 License
This model is licensed under the Apache 2.0 license.