vit-base-patch16-224-in21k Open-source Image Classification Model - Free Deployment for Precise Image Classification

vit-base-patch16-224-in21k-finetuned-cifar10_album_vitVMMRdb_make_model_album_pred

Developed by venetis

A Vision Transformer (ViT) image classification model, fine-tuned from aaraki/vit-base-patch16-224-in21k-finetuned-cifar10, a checkpoint that was itself fine-tuned on CIFAR-10

Image Classification

Transformers

Open Source License: Apache-2.0

Tags: Image Classification, High Accuracy, ViT Architecture

Downloads 30

Release Time: 11/27/2022

Model Overview

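This is an image classification model based on Google's Vision Transformer (ViT) architecture. It is a further fine-tuned version of aaraki/vit-base-patch16-224-in21k-finetuned-cifar10, a checkpoint that was itself fine-tuned on CIFAR-10. The model card does not document the dataset or label set used for this second fine-tuning step.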

Model Features

High Accuracy

Reaches 85.72% accuracy on the evaluation set at the final training epoch; the model card does not name the evaluation dataset

Transformer-based Architecture

Utilizes Vision Transformer (ViT) architecture with self-attention mechanisms for image processing

Fixed Input Resolution

Expects 224x224 pixel input images, processed as 16x16 pixel patches

Model Capabilities

Image Classification

Object Recognition

Visual Feature Extraction

Use Cases

Computer Vision

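Fine-tuned Image Classification

Classify images into the label set the model was fine-tuned on; the model card does not document that label set

85.72% accuracy on the evaluation set at the final training epoch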

General Object Recognition

Identify common objects such as airplanes, cars, birds, etc.

🚀 vit-base-patch16-224-in21k-finetuned-cifar10_album_vitVMMRdb_make_model_album_pred

This model is a fine-tuned version of aaraki/vit-base-patch16-224-in21k-finetuned-cifar10. It reaches an accuracy of 0.8594 and an F1 score of 0.8544 on its evaluation set.

🚀 Quick Start

The model was fine-tuned from aaraki/vit-base-patch16-224-in21k-finetuned-cifar10. The fine-tuning dataset is not specified (the model card records it as "None"). It achieves the following results on the evaluation set:

Loss: 0.5462
Accuracy: 0.8594
Precision: 0.8556
Recall: 0.8594
F1: 0.8544
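
The card gives no loading code. A minimal sketch of running the model through the Transformers `pipeline` API could look like the following. The Hub ID is an assumption, formed from the developer name and model name on this page, and `top_label` and `classify` are hypothetical helper names. The sketch also assumes `transformers`, `torch`, and `Pillow` are installed.

```python
# Assumed Hub ID (developer name + model name from this page); verify before use.
MODEL_ID = (
    "venetis/vit-base-patch16-224-in21k-finetuned-cifar10"
    "_album_vitVMMRdb_make_model_album_pred"
)


def top_label(predictions):
    """Return (label, score) of the highest-scoring prediction."""
    best = max(predictions, key=lambda p: p["score"])
    return best["label"], best["score"]


def classify(image_path, model_id=MODEL_ID):
    """Classify one image file; downloads the checkpoint on first call."""
    from transformers import pipeline  # imported lazily: heavy dependency

    classifier = pipeline("image-classification", model=model_id)
    # The pipeline returns a list of {"label": ..., "score": ...} dicts.
    return classifier(image_path)
```

A call such as `top_label(classify("photo.jpg"))` would then return the best label and its score, where `photo.jpg` is a placeholder path.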

📚 Documentation

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 64
eval_batch_size: 64
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 256
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
num_epochs: 15
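
To make the interaction of these values concrete, the sketch below recomputes the effective batch size and traces a linear learning-rate schedule with warmup. It assumes a single training device, takes the total step count (12585) from the last row of the training results table, and assumes the warmup ratio is converted to steps by rounding up. `linear_lr` is a hypothetical helper, not the trainer's own code.

```python
import math

# Values from the hyperparameter list above.
LEARNING_RATE = 5e-5
TRAIN_BATCH_SIZE = 64
GRAD_ACCUM_STEPS = 4
WARMUP_RATIO = 0.1
# Final step count from the last row of the training results table.
TOTAL_STEPS = 12585
NUM_DEVICES = 1  # assumption: training ran on a single device

# Samples contributing to each optimizer update.
effective_batch = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES

# Assumption: the warmup ratio is turned into a step count by rounding up.
warmup_steps = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def linear_lr(step):
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - warmup_steps)


for step in (0, warmup_steps, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, linear_lr(step))
```

Under these assumptions the effective batch size comes out to 256, matching `total_train_batch_size`, and the learning rate peaks at 5e-05 once warmup ends.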

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	Precision	Recall	F1
4.6112	1.0	839	4.5615	0.1425	0.0837	0.1425	0.0646
3.1177	2.0	1678	2.9595	0.4240	0.3424	0.4240	0.3283
2.0793	3.0	2517	2.0048	0.5771	0.5081	0.5771	0.5029
1.4566	4.0	3356	1.4554	0.6760	0.6333	0.6760	0.6280
1.1307	5.0	4195	1.1319	0.7350	0.7027	0.7350	0.7013
0.9367	6.0	5034	0.9328	0.7738	0.7546	0.7738	0.7503
0.7783	7.0	5873	0.8024	0.7986	0.7893	0.7986	0.7819
0.6022	8.0	6712	0.7187	0.8174	0.8098	0.8174	0.8055
0.5234	9.0	7551	0.6635	0.8313	0.8220	0.8313	0.8217
0.4298	10.0	8390	0.6182	0.8388	0.8337	0.8388	0.8302
0.3618	11.0	9229	0.5953	0.8455	0.8394	0.8455	0.8382
0.3262	12.0	10068	0.5735	0.8501	0.8443	0.8501	0.8436
0.3116	13.0	10907	0.5612	0.8527	0.8488	0.8527	0.8471
0.2416	14.0	11746	0.5524	0.8558	0.8500	0.8558	0.8496
0.2306	15.0	12585	0.5489	0.8572	0.8525	0.8572	0.8519
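
In every row of the table, Recall equals Accuracy. That is the result support-weighted averaging of per-class recall produces, because each class weight cancels the denominator of that class's recall. The card does not state the averaging mode, so treating it as weighted is an assumption. A small sketch on made-up labels (class names and helper functions are illustrative only):

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose prediction matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_recall(y_true, y_pred):
    """Per-class recall averaged with weights equal to each class's support."""
    support = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    n = len(y_true)
    return sum((support[c] / n) * (correct[c] / support[c]) for c in support)


# Made-up labels for illustration; not taken from the model's data.
y_true = ["a", "a", "a", "b", "b", "c"]
y_pred = ["a", "a", "b", "b", "c", "c"]

print(accuracy(y_true, y_pred), weighted_recall(y_true, y_pred))
```

The two printed values agree up to floating-point rounding, whatever the class distribution is.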

Framework versions

Transformers 4.24.0
PyTorch 1.12.1+cu113
Datasets 2.7.1
Tokenizers 0.13.2

📄 License

This project is licensed under the Apache-2.0 license.

Property	Details
Model Type	Fine-tuned version of a pre-trained model
Training Data	More information needed
Metrics	accuracy, precision, recall, f1
