
I-BERT RoBERTa Large

Developed by kssteven
I-BERT is an integer-only quantized version of RoBERTa-large: parameters are stored in INT8 and inference runs entirely on integer arithmetic, achieving up to a 4x inference speedup.
Downloads: 45
Release Time: 3/2/2022

Model Overview

An integer-quantized model based on the RoBERTa architecture, designed for efficient inference and suitable for tasks that require fast text processing.
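
As a minimal loading sketch, the checkpoint can be used through the standard Hugging Face transformers API (the checkpoint name kssteven/ibert-roberta-large matches this model card; everything else below is ordinary transformers usage, not specific to this card):

```python
from transformers import AutoModel, AutoTokenizer

# Load the I-BERT RoBERTa-large checkpoint; AutoModel resolves to the
# IBertModel class registered in transformers.
tokenizer = AutoTokenizer.from_pretrained("kssteven/ibert-roberta-large")
model = AutoModel.from_pretrained("kssteven/ibert-roberta-large")

inputs = tokenizer("Integer-only inference with I-BERT.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, seq_len, 1024) for the large model
```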

Model Features

Integer-only operations
All parameters are stored in INT8, and inference executes entirely with integer arithmetic, eliminating the need for floating-point computation units.
Quantization-aware training
Supports a three-stage fine-tuning process (full-precision fine-tuning → quantization → integer-only fine-tuning) to maximize post-quantization accuracy; see the sketch after this list.
4x inference acceleration
Achieves up to a 4x inference speedup over the floating-point baseline on an NVIDIA T4 GPU.
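
To illustrate the three-stage recipe, the hedged sketch below toggles the quant_mode flag of the I-BERT config between stages; the task setup, save paths, and the training loops themselves are placeholders, not part of this model card:

```python
from transformers import AutoConfig, AutoModelForSequenceClassification

ckpt = "kssteven/ibert-roberta-large"

# Stage 1: full-precision fine-tuning (quant_mode=False).
config = AutoConfig.from_pretrained(ckpt, quant_mode=False, num_labels=2)
model = AutoModelForSequenceClassification.from_pretrained(ckpt, config=config)
# ... run an ordinary full-precision fine-tuning loop here, then save:
model.save_pretrained("ibert-task-fp")  # hypothetical path

# Stage 2: reload the fine-tuned weights with quantization enabled.
config = AutoConfig.from_pretrained("ibert-task-fp", quant_mode=True)
model = AutoModelForSequenceClassification.from_pretrained(
    "ibert-task-fp", config=config
)

# Stage 3: integer-only fine-tuning to recover post-quantization accuracy.
# ... fine-tune again (typically with a smaller learning rate), then save:
model.save_pretrained("ibert-task-int8")  # hypothetical path
```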

Model Capabilities

Text classification
Semantic understanding
Efficient inference

Use Cases

Text processing
Semantic similarity judgment
E.g., sentence-pair similarity classification on the MRPC task
Maintains accuracy close to the full-precision model after quantization.
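
A sketch of sentence-pair inference for an MRPC-style task follows; the fine-tuned checkpoint path is hypothetical (the released checkpoint is the pretrained backbone, not a task-tuned model), and the label mapping is an assumption:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

ckpt = "ibert-task-int8"  # hypothetical checkpoint fine-tuned on MRPC
tokenizer = AutoTokenizer.from_pretrained("kssteven/ibert-roberta-large")
model = AutoModelForSequenceClassification.from_pretrained(ckpt)

# Encode the sentence pair jointly, as is standard for MRPC.
inputs = tokenizer(
    "The company posted record profits this quarter.",
    "Quarterly earnings for the company hit an all-time high.",
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
# Assumes label 1 = paraphrase, the usual MRPC convention.
print("paraphrase" if logits.argmax(-1).item() == 1 else "not paraphrase")
```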