BORT
BORT is a highly compressed variant of BERT-large obtained through neural architecture search, running inference up to 7.9x faster on CPU while outperforming BERT-large and some other uncompressed models on NLU benchmarks.
Release Time: 3/2/2022
Model Overview
BORT is an optimal subarchitecture of BERT-large, extracted via neural architecture search. It is intended for natural language understanding tasks, combining efficient inference with strong benchmark performance.
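As a BERT-style encoder, BORT can be loaded with the Hugging Face transformers library. A minimal sketch, assuming the checkpoint is published on the Hub as `amazon/bort` (the model id is an assumption; adjust it to the actual repository):

```python
# Hedged sketch: load a BERT-style encoder and extract contextual embeddings.
# The model id "amazon/bort" is an assumption; substitute the actual Hub id.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("amazon/bort")
model = AutoModel.from_pretrained("amazon/bort")

inputs = tokenizer("BORT is a compressed version of BERT-large.",
                   return_tensors="pt")
outputs = model(**inputs)

# Token-level contextual embeddings: shape (batch, seq_len, hidden_size)
embeddings = outputs.last_hidden_state
print(embeddings.shape)
```

These embeddings can then feed any downstream head (classification, QA, tagging), just as with BERT-large.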
Model Features
High Compression
BORT retains only 5.5% of the effective size of BERT-large (excluding the embedding layer) and 16% of its net size.
Fast Inference
Runs inference up to 7.9x faster than BERT-large on CPU.
High Performance
Outperforms BERT-large and other compressed variants on multiple NLU benchmarks, with absolute improvements ranging from 0.3% to 31%.
Low Training Cost
Requires only 288 GPU hours of pre-training, a small fraction of what RoBERTa-large and BERT-large require.
Model Capabilities
Natural Language Understanding
Text Classification
Question Answering Systems
Named Entity Recognition
Use Cases
Natural Language Processing
Text Classification
Used for classifying text, such as sentiment analysis and topic classification.
Outperforms BERT-large and other compressed variants.
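For classification tasks like these, a BERT-style encoder such as BORT is typically fine-tuned by attaching a small classification head to the [CLS] representation. A minimal PyTorch sketch; the hidden size and dropout rate here are illustrative assumptions, not BORT's published configuration:

```python
import torch
import torch.nn as nn

# Illustrative sketch, not the official BORT fine-tuning code:
# a linear classification head over the encoder's [CLS] representation,
# the standard recipe for BERT-style models.
class ClassificationHead(nn.Module):
    def __init__(self, hidden_size=768, num_labels=2):
        super().__init__()
        self.dropout = nn.Dropout(0.1)
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, hidden_states):
        cls = hidden_states[:, 0]  # representation of the [CLS] token
        return self.classifier(self.dropout(cls))

head = ClassificationHead()
dummy = torch.randn(2, 16, 768)  # stand-in for encoder output: (batch, seq_len, hidden)
logits = head(dummy)
print(logits.shape)  # torch.Size([2, 2])
```

In practice the encoder's `last_hidden_state` replaces the dummy tensor, and the head is trained jointly with (or on top of) the frozen encoder.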
Question Answering Systems
Used to build efficient question-answering systems that respond quickly to user queries.
Up to 7.9x faster inference on CPU.