BigBird base trivia-itc
This model is a fine-tuned checkpoint of bigbird-roberta-base. It is fine-tuned on the trivia_qa dataset with a BigBirdForQuestionAnsweringHead on top, and it provides an effective solution for question-answering tasks. Check out this model to see how well google/bigbird-base-trivia-itc performs on question answering.
Quick Start
This section shows how to set up and use the model effectively.
Features
- A fine-tuned version of the bigbird-roberta-base model.
- Fine-tuned on the trivia_qa dataset for question-answering tasks.
- Supports different attention types, block sizes, and numbers of random blocks (see Advanced Usage below).
Installation
No specific installation steps are provided in the original document; the checkpoint can be loaded directly with the Hugging Face transformers library (PyTorch backend).
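As a quick sanity check (a minimal sketch, assuming transformers and PyTorch are already installed, e.g. via pip), the following snippet verifies that the required classes can be imported:

# Minimal environment check: confirms torch and transformers are available
# and that the BigBird question-answering classes can be imported.
import torch
import transformers
from transformers import BigBirdForQuestionAnswering, BigBirdTokenizer

print("transformers:", transformers.__version__)
print("torch:", torch.__version__)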
Usage Examples
Basic Usage
from transformers import BigBirdTokenizer, BigBirdForQuestionAnswering

# Load the fine-tuned checkpoint and its tokenizer
tokenizer = BigBirdTokenizer.from_pretrained("google/bigbird-base-trivia-itc")
model = BigBirdForQuestionAnswering.from_pretrained("google/bigbird-base-trivia-itc")

question = "Replace me by any text you'd like."
context = "Put some context for answering"
encoded_input = tokenizer(question, context, return_tensors='pt')
output = model(**encoded_input)
Advanced Usage
model = BigBirdForQuestionAnswering.from_pretrained("google/bigbird-base-trivia-itc", attention_type="original_full")
model = BigBirdForQuestionAnswering.from_pretrained("google/bigbird-base-trivia-itc", block_size=16, num_random_blocks=2)
question = "Replace me by any text you'd like."
context = "Put some context for answering"
encoded_input = tokenizer(question, context, return_tensors='pt')
output = model(**encoded_input)
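The output holds start and end logits over the input tokens. A minimal decoding sketch (assuming the tokenizer and encoded_input from the examples above, and taking a simple argmax of each logit vector) could look like this:

import torch

# Pick the highest-scoring start and end positions and decode the span between them.
# (A more careful decoder would also enforce end_index >= start_index.)
start_index = int(torch.argmax(output.start_logits, dim=-1))
end_index = int(torch.argmax(output.end_logits, dim=-1))
answer_tokens = encoded_input["input_ids"][0][start_index : end_index + 1]
answer = tokenizer.decode(answer_tokens, skip_special_tokens=True)
print(answer)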
Documentation
Fine-tuning config & hyper-parameters
| Property | Details |
|----------|---------|
| No. of global tokens | 128 |
| Window length | 192 |
| No. of random tokens | 192 |
| Max. sequence length | 4096 |
| No. of heads | 12 |
| No. of hidden layers | 12 |
| Hidden layer size | 768 |
| Batch size | 32 |
| Loss | cross-entropy (noisy spans) |
Technical Details
The model is a fine-tuned version of bigbird-roberta-base on the trivia_qa dataset, using a BigBirdForQuestionAnsweringHead on top for question-answering tasks. Different attention types, block sizes, and numbers of random blocks can be configured to optimize performance; these settings are exposed on the loaded model's config, as sketched below.
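A short sketch, assuming the standard transformers BigBirdConfig attribute names, for inspecting these settings on the loaded model:

from transformers import BigBirdForQuestionAnswering

model = BigBirdForQuestionAnswering.from_pretrained("google/bigbird-base-trivia-itc")

# Sparse-attention settings exposed by the config (and overridable in from_pretrained)
print(model.config.attention_type)            # e.g. "block_sparse"
print(model.config.block_size)                # block size of the sparse attention
print(model.config.num_random_blocks)         # random blocks attended to per row
print(model.config.max_position_embeddings)   # maximum sequence length (4096)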
License
This project is licensed under the Apache-2.0 license.
BibTeX entry and citation info
@misc{zaheer2021big,
title={Big Bird: Transformers for Longer Sequences},
author={Manzil Zaheer and Guru Guruganesh and Avinava Dubey and Joshua Ainslie and Chris Alberti and Santiago Ontanon and Philip Pham and Anirudh Ravula and Qifan Wang and Li Yang and Amr Ahmed},
year={2021},
eprint={2007.14062},
archivePrefix={arXiv},
primaryClass={cs.LG}
}