finetuned-clothes
This model is a fine-tuned version of [google/vit-base-patch16-224-in21k](https://huggingface.co/google/vit-base-patch16-224-in21k) for clothes image classification, reaching 94.17% accuracy on the evaluation set.
Quick Start
This model is a fine-tuned version of google/vit-base-patch16-224-in21k on the clothes_simplifiedv2 dataset. It achieves the following results on the evaluation set:
- Loss: 0.2225
- Accuracy: 0.9417
⨠Features
This model classifies clothes category based on the given image.
Usage Examples
Basic Usage
import requests
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

# Load the image to classify (replace the placeholder with a real image URL).
url = 'insert image url here'
image = Image.open(requests.get(url, stream=True).raw)

# Load the image processor and the fine-tuned model from the Hub.
repo_name = "samokosik/finetuned-clothes"
image_processor = AutoImageProcessor.from_pretrained(repo_name)
model = AutoModelForImageClassification.from_pretrained(repo_name)

# Preprocess the image into a batch of pixel values.
encoding = image_processor(image.convert("RGB"), return_tensors="pt")
print(encoding.pixel_values.shape)

# Run inference without tracking gradients.
with torch.no_grad():
    outputs = model(**encoding)
    logits = outputs.logits

# Map the highest-scoring class index to its label.
predicted_class_idx = logits.argmax(-1).item()
print("Predicted class:", model.config.id2label[predicted_class_idx])
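
The snippet above prints only the single best class. If a ranked list with probabilities is more useful, the logits can be passed through a softmax first. The following is a minimal sketch in plain Python, not part of the original card: the helper name `top_k_labels`, the example logits, and the label order in `example_id2label` are all assumptions made for illustration.

```python
import math


def top_k_labels(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs for one image."""
    # Numerically stable softmax: subtract the largest logit before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort class indices by descending probability and keep the first k.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Made-up logits and an assumed label order, for illustration only.
example_id2label = {0: "hat", 1: "longsleeve", 2: "outswear", 3: "pants",
                    4: "shoes", 5: "shorts", 6: "shortsleve"}
example_logits = [0.1, 2.3, 1.9, -0.5, -1.2, 0.0, 0.4]

for label, prob in top_k_labels(example_logits, example_id2label):
    print(f"{label}: {prob:.3f}")
```

With the real model, `logits[0].tolist()` and `model.config.id2label` from the usage example would take the place of the made-up values.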
Technical Details
Limitations
Important Note
Due to lack of available data, we support only these categories: hat, longsleeve, outswear, pants, shoes, shorts, shortsleve.
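
Because the classifier only knows these seven categories, it will assign one of them to any image, including images of unsupported clothing. One possible guard is to reject low-confidence predictions. The sketch below is not part of the original card: `label_or_unknown` is a hypothetical helper, the threshold of 0.6 is an arbitrary starting point that would need tuning on real data, and the label order is assumed rather than read from the model config. Softmax confidence is only a rough signal, so treat this as a heuristic.

```python
# Assumed label order; in practice read it from model.config.id2label.
SUPPORTED_LABELS = ["hat", "longsleeve", "outswear", "pants",
                    "shoes", "shorts", "shortsleve"]


def label_or_unknown(probabilities, labels=SUPPORTED_LABELS, threshold=0.6):
    """Return the top label, or None when the top probability is below threshold."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    if probabilities[best] < threshold:
        return None  # too uncertain: possibly an unsupported category
    return labels[best]


# Made-up probability vectors for illustration.
print(label_or_unknown([0.9, 0.02, 0.02, 0.02, 0.02, 0.01, 0.01]))
print(label_or_unknown([1 / 7] * 7))
```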
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 4
- mixed_precision_training: Native AMP
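
To make the `linear` scheduler setting concrete, here is a small sketch of how the learning rate would decay over the run. It rests on two assumptions that the card does not state: zero warmup steps, and 486 optimizer steps per epoch (inferred from the training log in this card, where step 100 corresponds to epoch 0.2058). The function `linear_lr` is illustrative and is not the training code that was used.

```python
BASE_LR = 5e-05          # learning_rate from the list above
STEPS_PER_EPOCH = 486    # assumption: inferred from the log (step 100 is epoch 0.2058)
NUM_EPOCHS = 4           # num_epochs from the list above
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup (unused when warmup is 0).
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at the final step.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 972, 1900):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at the configured value, is halved midway through the run, and reaches zero at the last step.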
Training results
| Training Loss | Epoch  | Step | Validation Loss | Accuracy |
|---------------|--------|------|-----------------|----------|
| 0.7725        | 0.2058 | 100  | 0.7008          | 0.8178   |
| 0.5535        | 0.4115 | 200  | 0.4494          | 0.8994   |
| 0.4334        | 0.6173 | 300  | 0.3649          | 0.9169   |
| 0.3921        | 0.8230 | 400  | 0.3085          | 0.9184   |
| 0.3695        | 1.0288 | 500  | 0.3091          | 0.9184   |
| 0.2634        | 1.2346 | 600  | 0.3339          | 0.9082   |
| 0.4788        | 1.4403 | 700  | 0.2827          | 0.9257   |
| 0.3337        | 1.6461 | 800  | 0.2499          | 0.9344   |
| 0.34          | 1.8519 | 900  | 0.2586          | 0.9315   |
| 0.2424        | 2.0576 | 1000 | 0.2248          | 0.9402   |
| 0.1559        | 2.2634 | 1100 | 0.2333          | 0.9344   |
| 0.351         | 2.4691 | 1200 | 0.2495          | 0.9359   |
| 0.2206        | 2.6749 | 1300 | 0.2622          | 0.9242   |
| 0.3814        | 2.8807 | 1400 | 0.3138          | 0.9155   |
| 0.2141        | 3.0864 | 1500 | 0.2613          | 0.9315   |
| 0.112         | 3.2922 | 1600 | 0.2266          | 0.9402   |
| 0.0631        | 3.4979 | 1700 | 0.2255          | 0.9402   |
| 0.1986        | 3.7037 | 1800 | 0.2225          | 0.9417   |
| 0.2345        | 3.9095 | 1900 | 0.2235          | 0.9373   |
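
The headline metrics reported at the top of this card (loss 0.2225, accuracy 0.9417) match one particular evaluation step in the log. The short sketch below copies the step, epoch, validation loss, and accuracy columns from the training results and looks up the row with the lowest validation loss. It also estimates the number of optimizer steps per epoch from the first row. The variable names are illustrative only.

```python
# (step, epoch, validation_loss, accuracy) rows copied from the training results.
LOG = [
    (100, 0.2058, 0.7008, 0.8178), (200, 0.4115, 0.4494, 0.8994),
    (300, 0.6173, 0.3649, 0.9169), (400, 0.8230, 0.3085, 0.9184),
    (500, 1.0288, 0.3091, 0.9184), (600, 1.2346, 0.3339, 0.9082),
    (700, 1.4403, 0.2827, 0.9257), (800, 1.6461, 0.2499, 0.9344),
    (900, 1.8519, 0.2586, 0.9315), (1000, 2.0576, 0.2248, 0.9402),
    (1100, 2.2634, 0.2333, 0.9344), (1200, 2.4691, 0.2495, 0.9359),
    (1300, 2.6749, 0.2622, 0.9242), (1400, 2.8807, 0.3138, 0.9155),
    (1500, 3.0864, 0.2613, 0.9315), (1600, 3.2922, 0.2266, 0.9402),
    (1700, 3.4979, 0.2255, 0.9402), (1800, 3.7037, 0.2225, 0.9417),
    (1900, 3.9095, 0.2235, 0.9373),
]

# Row with the lowest validation loss.
best = min(LOG, key=lambda row: row[2])
print("lowest validation loss at step", best[0], "-> loss", best[2], "accuracy", best[3])

# Rough estimate of optimizer steps per epoch from the first logged row.
steps_per_epoch = round(LOG[0][0] / LOG[0][1])
print("approx. steps per epoch:", steps_per_epoch)
```

In this log the lowest validation loss and the highest accuracy fall on the same row, so either criterion selects the same evaluation step.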
Framework versions
- Transformers 4.40.1
- Pytorch 2.2.1+cu121
- Datasets 2.19.0
- Tokenizers 0.19.1
Documentation
This model was trained on the following dataset: https://huggingface.co/datasets/samokosik/clothes_simplifiedv2
License
This project is licensed under the Apache-2.0 license.