ViT Base Patch16 224 INT8 Static INC
This is an INT8 PyTorch model produced by post-training static quantization with Intel® Neural Compressor, starting from Google's fine-tuned ViT model. It significantly reduces model size while maintaining high accuracy.
Downloads: 82
Release Time: 9/6/2022
Model Overview
This model is a quantized version of the Vision Transformer (ViT) for image classification, optimized for the imagenet-1k dataset.
Model Features
Efficient quantization
Post-training static quantization with Intel® Neural Compressor compresses the model from FP32 to INT8, reducing its size by approximately 71%
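As a minimal sketch of what static INT8 quantization does numerically: each FP32 value is mapped onto the signed 8-bit range via a scale and zero-point derived from the observed value range. The function and variable names below are illustrative, not Intel Neural Compressor's actual API.

```python
# Hedged sketch of affine (asymmetric) per-tensor INT8 quantization,
# the mapping static post-training quantization applies to weights and
# activations. All names here are illustrative.

def quant_params(xmin, xmax, qmin=-128, qmax=127):
    """Derive scale and zero-point mapping [xmin, xmax] onto INT8."""
    scale = (xmax - xmin) / (qmax - qmin)
    zero_point = round(qmin - xmin / scale)
    return scale, zero_point

def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """FP32 -> INT8, clamped to the representable range."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))

def dequantize(q, scale, zero_point):
    """INT8 -> approximate FP32; the error is bounded by the scale."""
    return (q - zero_point) * scale

scale, zp = quant_params(-2.0, 6.0)   # example activation range
q = quantize(1.5, scale, zp)
x_hat = dequantize(q, scale, zp)      # close to 1.5, within one scale step
```

Since each INT8 weight takes 1 byte instead of 4, fully quantized weights shrink by 75%; the ~71% figure above is consistent with a few modules remaining in FP32.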
Precision control
Specific linear modules are selectively reverted to FP32 precision, keeping the accuracy loss within 1%
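To see why a partial FP32 fallback still yields a large size reduction, here is a back-of-the-envelope sketch in plain Python. The module names and parameter counts are made up for illustration; only the 4-bytes-vs-1-byte arithmetic is the point.

```python
# Hedged sketch: estimate model size when most weights are INT8 (1 byte
# each) but selected accuracy-sensitive modules stay FP32 (4 bytes each).
# Parameter counts and module names below are illustrative, not the real
# ViT layout.

layer_params = {
    "patch_embed": 590_592,
    "encoder.linear.0": 2_360_320,
    "encoder.linear.1": 2_360_320,
    "classifier": 769_000,
}
fp32_fallback = {"classifier"}  # modules reverted to FP32 for accuracy

def model_bytes(params, fallback):
    """Total weight bytes: 4 per FP32 parameter, 1 per INT8 parameter."""
    return sum(n * (4 if name in fallback else 1)
               for name, n in params.items())

fp32_size = sum(layer_params.values()) * 4          # all-FP32 baseline
int8_size = model_bytes(layer_params, fp32_fallback)
reduction = 1 - int8_size / fp32_size               # fraction saved
```

The reduction lands below the 75% ceiling of full quantization, which is how a mixed INT8/FP32 model ends up around the ~71% figure quoted above.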
Optimized calibration
Calibration uses the training-set data loader, sampling 1000 examples by default (matching the 1000 ImageNet classes)
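The calibration step can be sketched as follows: feed a bounded number of samples through the model, record the observed activation range, and derive the INT8 scale from it. This is a simplified stand-in, not Intel Neural Compressor's actual calibration code; the data loader is faked with random values.

```python
# Hedged sketch of static PTQ calibration: observe activation min/max
# over a limited sample budget, then derive an INT8 scale. The loader
# and sample contents are synthetic stand-ins.
import random

random.seed(0)

def fake_dataloader(num_samples):
    """Stand-in for the training-set data loader (illustrative)."""
    for _ in range(num_samples):
        yield [random.uniform(-3.0, 3.0) for _ in range(16)]

def calibrate(dataloader, max_samples=1000):
    """Track observed range over at most max_samples samples."""
    obs_min, obs_max = float("inf"), float("-inf")
    seen = 0
    for sample in dataloader:
        obs_min = min(obs_min, min(sample))
        obs_max = max(obs_max, max(sample))
        seen += 1
        if seen >= max_samples:  # default calibration budget
            break
    scale = (obs_max - obs_min) / 255  # spread range over INT8 levels
    return obs_min, obs_max, scale

lo, hi, scale = calibrate(fake_dataloader(2000))
```

A too-small calibration set risks underestimating the activation range, which is why a budget on the order of 1000 samples is a common default.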
Model Capabilities
Image classification
Efficient inference
Low memory footprint
Use Cases
Computer vision
Image classification system
Can be used to build efficient image classification systems, especially for general 1000-class classification
Achieves 80.576% accuracy on imagenet-1k
Edge device deployment
Suitable for deployment on resource-constrained edge devices for image classification tasks
The model is only 94 MB, much smaller than the original FP32 model