vit-artworkclassifier
This model identifies the artwork style of an input image, offering a practical tool for image classification in the art domain. It is a fine-tuned version of [google/vit-base-patch16-224-in21k](https://huggingface.co/google/vit-base-patch16-224-in21k) on the imagefolder dataset, a subset of the artbench-10 dataset.
Quick Start
This model returns the artwork style of any input image. It is a fine-tuned version of [google/vit-base-patch16-224-in21k](https://huggingface.co/google/vit-base-patch16-224-in21k) on the imagefolder dataset, a subset of the [artbench-10 dataset](https://www.kaggle.com/datasets/alexanderliao/artbench10) with a train set of 1000 artworks per class and a validation set of 100 artworks per class. It achieves the following results on the evaluation set:
- Loss: 1.1392
- Accuracy: 0.5948
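For a quick check, the model can be loaded through the `transformers` image-classification pipeline. This is a minimal sketch, not the authors' own snippet: it loads the feature extractor from the base checkpoint (mirroring the usage example further down, in case the fine-tuned repo does not ship its own preprocessing config), and `artwork.jpg` is a placeholder for any local image.

```python
from transformers import ViTFeatureExtractor, pipeline

# Preprocessing comes from the base checkpoint, as in the usage example below
feature_extractor = ViTFeatureExtractor.from_pretrained("google/vit-base-patch16-224-in21k")

# Wrap the fine-tuned classifier in an image-classification pipeline
classifier = pipeline(
    "image-classification",
    model="oschamp/vit-artworkclassifier",
    feature_extractor=feature_extractor,
)

# "artwork.jpg" is a placeholder path; any PIL-compatible image works
print(classifier("artwork.jpg"))
```

The pipeline returns a list of `{label, score}` dictionaries sorted by confidence.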
Features
- Artwork Style Classification: Predicts the artwork style of an input image across nine artbench-10 classes.
- Fine-Tuned Model: Built on a well-known ViT base checkpoint and fine-tuned on an artbench-10 subset for this specific task.
Documentation
Model description
You can find a description of the project that this model was trained for here: https://medium.com/@oliverpj.schamp/training-and-evaluating-stable-diffusion-for-artwork-generation-b099d1f5b7a6
Intended uses & limitations
This model covers only 9 of the 10 artbench-10 classes; it does not include ukiyo_e, due to availability and formatting issues.
Training and evaluation data
Train: 1000 randomly selected images per class from artbench-10. Validation: 100 randomly selected images per class from artbench-10.
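To build a similar split yourself, the `imagefolder` loader in `datasets` constructs a labeled dataset from a one-directory-per-class layout. This is a sketch under assumed paths; the `data` directory and its layout are illustrative, not the authors' actual setup.

```python
from datasets import load_dataset

# Assumes a layout like data/train/<style>/*.jpg and data/validation/<style>/*.jpg
dataset = load_dataset("imagefolder", data_dir="data")

# Class names are inferred from the subdirectory names
print(dataset["train"].features["label"].names)
```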
Training procedure
Training hyperparameters
The following hyperparameters were used during training (a sketch mapping them onto `TrainingArguments` follows the list):
- learning_rate: 0.0001
- train_batch_size: 32
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 4
- mixed_precision_training: Native AMP
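As a rough guide, these settings correspond to the following `transformers` `TrainingArguments`. This is a sketch rather than the authors' actual training script; `output_dir` is a placeholder, and the listed Adam betas/epsilon match the Trainer defaults, so no extra optimizer arguments are needed.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="vit-artworkclassifier",  # placeholder output directory
    learning_rate=1e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=4,
    fp16=True,  # native AMP mixed-precision training
)
```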
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 1.5906        | 0.36  | 100  | 1.4709          | 0.4847   |
| 1.3395        | 0.72  | 200  | 1.3208          | 0.5074   |
| 1.1461        | 1.08  | 300  | 1.3363          | 0.5165   |
| 0.9593        | 1.44  | 400  | 1.1790          | 0.5846   |
| 0.8761        | 1.8   | 500  | 1.1252          | 0.5902   |
| 0.5922        | 2.16  | 600  | 1.1392          | 0.5948   |
| 0.4803        | 2.52  | 700  | 1.1560          | 0.5936   |
| 0.4454        | 2.88  | 800  | 1.1545          | 0.6118   |
| 0.2271        | 3.24  | 900  | 1.2284          | 0.6039   |
| 0.207         | 3.6   | 1000 | 1.2625          | 0.5959   |
| 0.1958        | 3.96  | 1100 | 1.2621          | 0.6005   |
Framework versions
- Transformers 4.26.1
- Pytorch 1.13.1+cu117
- Datasets 2.9.0
- Tokenizers 0.13.2
Usage Examples
Basic Usage
```python
import torch
from transformers import ViTFeatureExtractor, ViTForImageClassification

def vit_classify(image):
    # Load the fine-tuned classifier and switch to inference mode
    vit = ViTForImageClassification.from_pretrained("oschamp/vit-artworkclassifier")
    vit.eval()
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    vit.to(device)

    # Preprocess the image with the base model's feature extractor
    model_name_or_path = "google/vit-base-patch16-224-in21k"
    feature_extractor = ViTFeatureExtractor.from_pretrained(model_name_or_path)
    encoding = feature_extractor(images=image, return_tensors="pt")
    pixel_values = encoding["pixel_values"].to(device)

    # Forward pass; return the index of the highest-scoring class
    with torch.no_grad():
        outputs = vit(pixel_values)
    logits = outputs.logits
    prediction = logits.argmax(-1)
    return prediction.item()
```
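The function returns an integer class index. A hypothetical follow-up maps that index back to a style name through the model's `id2label` config; `artwork.jpg` is a placeholder path.

```python
from PIL import Image
from transformers import ViTForImageClassification

image = Image.open("artwork.jpg")  # placeholder path
label_id = vit_classify(image)

# Look up the human-readable style name stored in the model config
vit = ViTForImageClassification.from_pretrained("oschamp/vit-artworkclassifier")
print(vit.config.id2label[label_id])
```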
Technical Details
The model is fine-tuned on the imagefolder dataset, a subset of the artbench-10 dataset. Fine-tuning adjusts the pretrained weights under the hyperparameters above to optimize performance on artwork style classification. The base model [google/vit-base-patch16-224-in21k](https://huggingface.co/google/vit-base-patch16-224-in21k) provides a solid foundation, and fine-tuning adapts it to the characteristics of the artbench-10 styles.
License
This model is licensed under the Apache-2.0 license.
Model Information
| Property | Details |
|----------|---------|
| Model Type | Fine-tuned version of google/vit-base-patch16-224-in21k for image classification |
| Training Data | Subset of the artbench-10 dataset (imagefolder), with 1000 training images and 100 validation images per class |
| Metrics | Accuracy: 0.5948 on the evaluation set |
| Base Model | google/vit-base-patch16-224-in21k |