BORT
BORT is a highly compressed variant of BERT-large obtained through neural architecture search, running inference up to 7.9x faster on CPU while outperforming BERT-large and some other uncompressed models on NLU benchmarks.
Release Time: 3/2/2022
Model Overview
BORT is an optimal subarchitecture of BERT-large, extracted via neural architecture search. It is intended for natural language understanding tasks, combining efficient inference with strong benchmark performance.
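As a BERT-style encoder, BORT can be loaded with the Hugging Face transformers library. A minimal sketch, assuming the checkpoint is published on the Hub as `amazon/bort` (the model id is an assumption; adjust it to the actual repository):

```python
# Hedged sketch: load a BERT-style encoder and extract contextual embeddings.
# The model id "amazon/bort" is an assumption; substitute the actual Hub id.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("amazon/bort")
model = AutoModel.from_pretrained("amazon/bort")

inputs = tokenizer("BORT is a compressed version of BERT-large.",
                   return_tensors="pt")
outputs = model(**inputs)

# Token-level contextual embeddings: shape (batch, seq_len, hidden_size)
embeddings = outputs.last_hidden_state
print(embeddings.shape)
```

These embeddings can then feed any downstream head (classification, QA, tagging), just as with BERT-large.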
Model Features
High Compression
BORT retains only 5.5% of the effective size of BERT-large (excluding the embedding layer) and 16% of its net size.
Fast Inference
Runs inference up to 7.9x faster than BERT-large on CPU.
High Performance
Outperforms BERT-large and other compressed variants on multiple NLU benchmarks, with absolute improvements ranging from 0.3% to 31%.
Low Training Cost
Requires only 288 GPU hours of pre-training, a small fraction of what RoBERTa-large and BERT-large require.
Model Capabilities
Natural Language Understanding
Text Classification
Question Answering Systems
Named Entity Recognition
Use Cases
Natural Language Processing
Text Classification
Used for classifying text, such as sentiment analysis and topic classification.
Outperforms BERT-large and other compressed variants.
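For classification tasks like these, a BERT-style encoder such as BORT is typically fine-tuned by attaching a small classification head to the [CLS] representation. A minimal PyTorch sketch; the hidden size and dropout rate here are illustrative assumptions, not BORT's published configuration:

```python
import torch
import torch.nn as nn

# Illustrative sketch, not the official BORT fine-tuning code:
# a linear classification head over the encoder's [CLS] representation,
# the standard recipe for BERT-style models.
class ClassificationHead(nn.Module):
    def __init__(self, hidden_size=768, num_labels=2):
        super().__init__()
        self.dropout = nn.Dropout(0.1)
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, hidden_states):
        cls = hidden_states[:, 0]  # representation of the [CLS] token
        return self.classifier(self.dropout(cls))

head = ClassificationHead()
dummy = torch.randn(2, 16, 768)  # stand-in for encoder output: (batch, seq_len, hidden)
logits = head(dummy)
print(logits.shape)  # torch.Size([2, 2])
```

In practice the encoder's `last_hidden_state` replaces the dummy tensor, and the head is trained jointly with (or on top of) the frozen encoder.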
Question Answering Systems
Used to build efficient question-answering systems that respond quickly to user queries.
Up to 7.9x faster inference on CPU.