Dogs-Breed-Image-Classification-V2
This model is a fine-tuned version of microsoft/resnet-152 for dog breed image classification. It reaches an accuracy of 0.8408 on its Stanford Dogs evaluation split.
Quick Start
This model is a fine-tuned version of microsoft/resnet-152 on the Stanford Dogs dataset.
It achieves the following results on the evaluation set:
- Loss: 1.0115
- Accuracy: 0.8408
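A minimal inference sketch, assuming the model is loaded through the Transformers `pipeline` API with `transformers`, `torch`, and `Pillow` installed. This card does not state the model's Hub repository ID, so `MODEL_ID` below is a hypothetical placeholder that must be replaced. The names `classify_dog_breed` and `format_predictions` are illustrative helpers, not part of the model.

```python
from typing import Dict, List

# Hypothetical placeholder: replace with the actual Hub repository ID of this model.
MODEL_ID = "<namespace>/Dogs-Breed-Image-Classification-V2"


def classify_dog_breed(image_path: str, model_id: str = MODEL_ID, top_k: int = 5) -> List[Dict]:
    """Return the top_k predicted breeds for one image as {"label", "score"} dicts."""
    # Imported lazily so format_predictions can be used without transformers installed.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=model_id)
    return classifier(image_path, top_k=top_k)


def format_predictions(predictions: List[Dict]) -> List[str]:
    """Render pipeline output as "label: score%" strings, highest score first."""
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return [f'{p["label"]}: {p["score"]:.1%}' for p in ranked]
```

Calling `format_predictions(classify_dog_breed("dog.jpg"))` would then list the most likely breeds for a local image, where `dog.jpg` is again only an example path.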
Features
- Based on the widely used microsoft/resnet-152 architecture.
- Fine-tuned on the Stanford Dogs dataset to classify images into 120 dog breeds.
Documentation
Model description
This model was trained on the Stanford Dogs dataset, obtained from Kaggle.
Quoted from the dataset website:
The Stanford Dogs dataset contains images of 120 breeds of dogs from around the world. This dataset has been built using images and annotation from ImageNet for the task of fine-grained image categorization. It was originally collected for fine-grain image categorization, a challenging problem as certain dog breeds have near identical features or differ in colour and age.
Citation:
Aditya Khosla, Nityananda Jayadevaprakash, Bangpeng Yao and Li Fei-Fei. Novel dataset for Fine-Grained Image Categorization. First Workshop on Fine-Grained Visual Categorization (FGVC), IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011.
Secondary:
J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li and L. Fei-Fei. ImageNet: A Large-Scale Hierarchical Image Database. IEEE Computer Vision and Pattern Recognition (CVPR), 2009.
Intended uses & limitations
This model is fine-tuned solely for classifying images into the 120 dog breeds of the Stanford Dogs dataset.
Training and evaluation data
The dataset was split into 75% training data and 25% test data.
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-06
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
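To illustrate what `lr_scheduler_type: linear` means for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps, since none are listed here, and a total of 9660 steps (20 epochs of 483 steps, as reported in the training results). The function name `linear_lr` is illustrative, and the formula follows the usual linear warmup and decay rule rather than anything stated in this card.

```python
BASE_LR = 5e-06       # learning_rate from the hyperparameter list
TOTAL_STEPS = 9660    # 20 epochs x 483 steps, per the training results
WARMUP_STEPS = 0      # assumption: no warmup is listed in this card


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step under linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)
```

Under these assumptions the rate starts at 5e-06, is halved by the end of epoch 10 (step 4830), and reaches zero at step 9660.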
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| No log | 1.0 | 483 | 4.6525 | 0.7382 |
| 4.7329 | 2.0 | 966 | 4.3558 | 0.7298 |
| 4.5033 | 3.0 | 1449 | 3.9568 | 0.7471 |
| 4.1405 | 4.0 | 1932 | 3.5160 | 0.7782 |
| 3.7176 | 5.0 | 2415 | 3.0805 | 0.7946 |
| 3.293 | 6.0 | 2898 | 2.6907 | 0.8021 |
| 2.8898 | 7.0 | 3381 | 2.3044 | 0.8126 |
| 2.5343 | 8.0 | 3864 | 2.0091 | 0.8177 |
| 2.2188 | 9.0 | 4347 | 1.7910 | 0.8126 |
| 1.9698 | 10.0 | 4830 | 1.6015 | 0.8194 |
| 1.7532 | 11.0 | 5313 | 1.4383 | 0.8220 |
| 1.586 | 12.0 | 5796 | 1.3355 | 0.8264 |
| 1.4533 | 13.0 | 6279 | 1.2467 | 0.8260 |
| 1.336 | 14.0 | 6762 | 1.1575 | 0.8313 |
| 1.2641 | 15.0 | 7245 | 1.1038 | 0.8321 |
| 1.185 | 16.0 | 7728 | 1.0606 | 0.8395 |
| 1.1329 | 17.0 | 8211 | 1.0178 | 0.8398 |
| 1.0977 | 18.0 | 8694 | 1.0115 | 0.8408 |
| 1.0732 | 19.0 | 9177 | 0.9945 | 0.8381 |
| 1.0508 | 20.0 | 9660 | 0.9930 | 0.8393 |
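The step counts in the training results can be cross-checked against the 75% split and the batch size with a short calculation. This is a sketch under stated assumptions: the total of 20,580 images is the published size of the full Stanford Dogs dataset and is not stated in this card, and the run is assumed to use a single device, no gradient accumulation, and to keep the final partial batch.

```python
import math

TOTAL_IMAGES = 20_580   # assumption: full Stanford Dogs dataset size, not stated in this card
TRAIN_FRACTION = 0.75   # from "Training and evaluation data"
BATCH_SIZE = 32         # train_batch_size
NUM_EPOCHS = 20         # num_epochs

train_images = int(TOTAL_IMAGES * TRAIN_FRACTION)
steps_per_epoch = math.ceil(train_images / BATCH_SIZE)  # last partial batch counts as a step
total_steps = steps_per_epoch * NUM_EPOCHS

print(train_images, steps_per_epoch, total_steps)  # prints: 15435 483 9660
```

Under these assumptions the result matches the reported 483 steps after epoch 1 and 9660 steps after epoch 20.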
Framework versions
- Transformers 4.37.2
- Pytorch 2.3.0
- Datasets 2.15.0
- Tokenizers 0.15.1
License
This model is released under the Apache-2.0 license.
Information Table
| Property | Details |
|---|---|
| Model Type | Fine-tuned version of microsoft/resnet-152 for dog breed image classification |
| Training Data | Stanford Dogs dataset from Kaggle |
| Metrics | Accuracy: 0.8408 |