Vit Base Patch16 224 Finetuned Cifar10
This is an image classification model based on the Vision Transformer (ViT) architecture, fine-tuned on the CIFAR10 dataset, achieving 98.76% accuracy.
Downloads 15
Release Time: 12/7/2022
Model Overview
This model is a fine-tuned version of Google's original ViT model on the CIFAR10 dataset, specifically designed for 10-class image classification tasks.
Model Features
High accuracy
Achieves 98.76% classification accuracy on the CIFAR10 test set
Transformer-based architecture
Uses the Vision Transformer architecture rather than a traditional convolutional neural network (CNN)
Small image processing capability
The base ViT takes 224x224 inputs, whereas CIFAR10 images are 32x32; the model name indicates the 224x224 input size was kept, so the small images have to be resized up to that resolution before classification
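The "Patch16 224" in the model name implies a 224x224 input cut into 16x16 patches. The sketch below works through that arithmetic and shows a dependency-free nearest-neighbour upscale from 32x32 to 224x224. The nearest-neighbour choice and the helper names are assumptions for illustration only; the source does not say which interpolation was used during fine-tuning.

```python
IMAGE_SIZE = 224   # input resolution implied by the model name
PATCH_SIZE = 16    # patch edge length implied by the model name
CIFAR_SIZE = 32    # native CIFAR10 resolution


def token_count(image_size=IMAGE_SIZE, patch_size=PATCH_SIZE):
    """Tokens seen by the encoder: one per patch plus the [CLS] token."""
    patches_per_side = image_size // patch_size
    return patches_per_side * patches_per_side + 1


def upscale_nearest(image, factor):
    """Nearest-neighbour upscale of a 2-D grid (list of rows) by an integer factor."""
    out = []
    for row in image:
        # Repeat every pixel `factor` times horizontally...
        wide_row = [pixel for pixel in row for _ in range(factor)]
        # ...then repeat the widened row `factor` times vertically.
        out.extend(list(wide_row) for _ in range(factor))
    return out


factor = IMAGE_SIZE // CIFAR_SIZE  # 224 / 32 = 7
# Stand-in single-channel "image" whose pixel values are just their index.
tiny = [[r * CIFAR_SIZE + c for c in range(CIFAR_SIZE)] for r in range(CIFAR_SIZE)]
big = upscale_nearest(tiny, factor)

print(token_count(), factor, len(big), len(big[0]))  # prints: 197 7 224 224
```

Each original pixel becomes a 7x7 block, so a single 16x16 patch spans only a little over two original CIFAR10 pixels per side.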
Model Capabilities
Image classification
Feature extraction
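As a rough sketch of the image classification capability, the snippet below uses the Hugging Face `transformers` image-classification pipeline. The source does not give a repository ID, so `MODEL_ID` is a hypothetical placeholder that must be replaced with the actual checkpoint name, and `classify` and `best_label` are illustrative helpers rather than part of any library.

```python
# Hypothetical placeholder: the source page does not state the repository ID.
MODEL_ID = "your-namespace/vit-base-patch16-224-finetuned-cifar10"


def classify(image, model_id=MODEL_ID, top_k=3):
    """Return the top_k predictions for an image path, URL, or PIL image."""
    # Imported lazily so the helpers below work without transformers installed.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=model_id)
    # The pipeline's image processor resizes the input according to the
    # checkpoint's own preprocessing configuration.
    return classifier(image, top_k=top_k)


def best_label(predictions):
    """Pick the label with the highest score from pipeline-style output."""
    return max(predictions, key=lambda p: p["score"])["label"]


if __name__ == "__main__":
    # "example.png" is a stand-in path; running this needs a real MODEL_ID.
    results = classify("example.png")
    print(best_label(results))
```

The pipeline returns a list of dictionaries with `label` and `score` keys, which is the shape `best_label` expects.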
Use Cases
Computer vision
Object recognition
Recognize 10 common object categories in the CIFAR10 dataset
98.76% accuracy
Educational demonstration
Used for teaching demonstrations of Transformer applications in visual tasks
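For a teaching demonstration, the last step of 10-class recognition can be shown without any framework: the classifier head emits ten logits, softmax turns them into probabilities, and the largest one selects the class. The sketch below assumes the standard CIFAR10 class order; the checkpoint's own label mapping is authoritative and may differ. The logit values are made up for illustration.

```python
import math

# Standard CIFAR10 class order (assumed; check the checkpoint's label mapping).
CIFAR10_CLASSES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits, class_names=CIFAR10_CLASSES):
    """Return (class name, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


# Made-up logits, not real model output.
example_logits = [0.1, -1.2, 0.3, 4.0, 0.0, 2.5, -0.7, 0.2, -2.0, 0.4]
label, confidence = predict(example_logits)
print(label, round(confidence, 3))
```

With these example values the largest logit sits at index 3, so the predicted class is "cat".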
© 2025 AIbase