Dit Base Finetuned Rvlcdip Finetuned Data200
This model is a fine-tuned version of microsoft/dit-base-finetuned-rvlcdip on an image folder dataset, primarily used for image classification tasks.
Downloads 16
Release Time: 2/27/2023
Model Overview
This is a fine-tuned image classification model based on the DiT (Document Image Transformer) architecture, optimized for document image classification tasks.
Model Features
Document Image Optimization
Specifically optimized for document image classification tasks
Transfer Learning
Fine-tuned based on a pre-trained DiT model
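200-Epoch Training
Trained for 200 epochs. After the first ten epochs, validation accuracy fluctuates between roughly 0.45 and 0.68 while validation loss climbs, so later checkpoints are not necessarily better than earlier ones.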
Model Capabilities
Document Image Classification
Image Feature Extraction
Use Cases
Document Processing
Document Type Recognition
Automatically identify document types (e.g., invoices, contracts, forms)
Achieved 56.99% accuracy on the evaluation set
🚀 dit-base-finetuned-rvlcdip-finetuned-data200
This model is a fine-tuned version of microsoft/dit-base-finetuned-rvlcdip on the imagefolder dataset. It can be used for image classification tasks and achieves the following results on the evaluation set:
- Loss: 3.0080
- Accuracy: 0.5699
✨ Features
- Fine-tuned from the pre-trained model microsoft/dit-base-finetuned-rvlcdip on the imagefolder dataset to adapt it to a specific image classification task.
- Reaches an accuracy of 0.5699 on the evaluation set.
📚 Documentation
Model description
This model is a fine-tuned version of microsoft/dit-base-finetuned-rvlcdip on the imagefolder dataset.
Intended uses & limitations
The model is mainly intended for image classification tasks. Its limitations have not been documented.
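As a minimal sketch of how such a classifier is typically called with the Hugging Face `transformers` library, the code below loads a checkpoint, runs one image through it, and ranks the predicted labels. Several parts are assumptions: the repository id of this fine-tuned checkpoint is not given here, so the base model `microsoft/dit-base-finetuned-rvlcdip` stands in for it, and the helper names `rank_labels` and `classify` are illustrative rather than part of any library.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Assumption: the fine-tuned checkpoint's repository id is not stated in this
# card, so the base model it was fine-tuned from is used as a stand-in.
MODEL_ID = "microsoft/dit-base-finetuned-rvlcdip"


def rank_labels(
    logits: Sequence[float], id2label: Dict[int, str], k: int = 3
) -> List[Tuple[str, float]]:
    """Turn raw class logits into the k most probable (label, probability) pairs."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in order[:k]]


def classify(image_path: str, model_id: str = MODEL_ID, k: int = 3):
    """Classify one document image. Requires torch, transformers and Pillow."""
    # Imported lazily so rank_labels can be used without these packages.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, AutoModelForImageClassification

    processor = AutoImageProcessor.from_pretrained(model_id)
    model = AutoModelForImageClassification.from_pretrained(model_id)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0]
    return rank_labels(logits.tolist(), model.config.id2label, k)
```

Calling `classify("scan.png")`, where `scan.png` is a hypothetical local file, would return up to three `(label, probability)` pairs. Given the reported evaluation accuracy of 0.5699, the top prediction should be treated as a suggestion rather than a final answer.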
Training and evaluation data
The model was trained and evaluated on an imagefolder dataset. No further details about the data are provided.
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 200
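The derived values in this list can be checked with a little arithmetic. The sketch below makes two assumptions: that there are 46 optimizer steps per epoch (read from the Step column of the training results table), and that the linear scheduler warms up from zero and then decays linearly to zero. The function name `linear_lr` is illustrative and not taken from any library.

```python
# Values taken from the hyperparameter list above.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 4
NUM_EPOCHS = 200
WARMUP_RATIO = 0.1

# Assumption: 46 optimizer steps per epoch, as the Step column of the
# training results table suggests (step 46 at epoch 1.0).
STEPS_PER_EPOCH = 46

# Effective batch size per optimizer step.
total_train_batch_size = TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS
total_steps = STEPS_PER_EPOCH * NUM_EPOCHS
warmup_steps = round(total_steps * WARMUP_RATIO)


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return LEARNING_RATE * remaining / max(1, total_steps - warmup_steps)


print(total_train_batch_size)  # 8
print(total_steps)             # 9200
print(warmup_steps)            # 920
```

Under these assumptions the warmup ends at step 920, which the training results table lists as epoch 20, and the learning rate decays from its peak of 5e-05 for the remaining 180 epochs.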
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy |
---|---|---|---|---|
2.1142 | 1.0 | 46 | 2.0131 | 0.3441 |
1.9953 | 2.0 | 92 | 1.9577 | 0.4086 |
1.9558 | 3.0 | 138 | 1.9231 | 0.4301 |
1.9251 | 4.0 | 184 | 1.8015 | 0.4946 |
1.6485 | 5.0 | 230 | 1.7045 | 0.5269 |
1.5973 | 6.0 | 276 | 1.5806 | 0.5054 |
1.4755 | 7.0 | 322 | 1.4849 | 0.5054 |
1.4537 | 8.0 | 368 | 1.4356 | 0.5161 |
1.416 | 9.0 | 414 | 1.4512 | 0.5269 |
1.3645 | 10.0 | 460 | 1.3857 | 0.5591 |
1.3017 | 11.0 | 506 | 1.3108 | 0.5484 |
1.2794 | 12.0 | 552 | 1.3027 | 0.5376 |
1.1553 | 13.0 | 598 | 1.2883 | 0.5484 |
1.1526 | 14.0 | 644 | 1.3554 | 0.5054 |
1.1116 | 15.0 | 690 | 1.3235 | 0.5914 |
1.1925 | 16.0 | 736 | 1.2401 | 0.5806 |
1.1297 | 17.0 | 782 | 1.3425 | 0.5914 |
0.9717 | 18.0 | 828 | 1.3538 | 0.5484 |
0.8404 | 19.0 | 874 | 1.2648 | 0.5699 |
0.7008 | 20.0 | 920 | 1.4971 | 0.5376 |
1.1454 | 21.0 | 966 | 1.4137 | 0.4839 |
0.6849 | 22.0 | 1012 | 1.2801 | 0.5591 |
0.8566 | 23.0 | 1058 | 1.2380 | 0.5699 |
0.8956 | 24.0 | 1104 | 1.2903 | 0.6129 |
0.8004 | 25.0 | 1150 | 1.4372 | 0.5591 |
0.818 | 26.0 | 1196 | 1.1640 | 0.6344 |
0.6387 | 27.0 | 1242 | 1.3120 | 0.6452 |
0.7282 | 28.0 | 1288 | 1.4678 | 0.5161 |
0.7426 | 29.0 | 1334 | 1.4815 | 0.5269 |
0.735 | 30.0 | 1380 | 1.2714 | 0.6129 |
0.6769 | 31.0 | 1426 | 1.2262 | 0.5699 |
0.5562 | 32.0 | 1472 | 1.3348 | 0.6344 |
0.6671 | 33.0 | 1518 | 1.4159 | 0.6129 |
0.3708 | 34.0 | 1564 | 1.6416 | 0.5484 |
0.3967 | 35.0 | 1610 | 1.3298 | 0.5699 |
0.4692 | 36.0 | 1656 | 1.3559 | 0.5699 |
0.632 | 37.0 | 1702 | 1.3349 | 0.5699 |
0.3719 | 38.0 | 1748 | 1.4697 | 0.5914 |
0.4238 | 39.0 | 1794 | 1.5207 | 0.6022 |
0.3608 | 40.0 | 1840 | 1.5557 | 0.5591 |
0.6252 | 41.0 | 1886 | 1.6247 | 0.5269 |
0.4183 | 42.0 | 1932 | 1.5885 | 0.5914 |
0.3922 | 43.0 | 1978 | 1.6593 | 0.5699 |
0.5715 | 44.0 | 2024 | 1.5270 | 0.5699 |
0.3656 | 45.0 | 2070 | 1.8899 | 0.5054 |
0.3656 | 46.0 | 2116 | 2.0936 | 0.4624 |
0.4003 | 47.0 | 2162 | 1.5610 | 0.5054 |
0.446 | 48.0 | 2208 | 1.7388 | 0.5376 |
0.5219 | 49.0 | 2254 | 1.4976 | 0.6129 |
0.3488 | 50.0 | 2300 | 1.5744 | 0.5914 |
0.323 | 51.0 | 2346 | 1.6312 | 0.6022 |
0.3713 | 52.0 | 2392 | 1.6975 | 0.5591 |
0.2981 | 53.0 | 2438 | 1.6229 | 0.5699 |
0.3422 | 54.0 | 2484 | 2.0909 | 0.4624 |
0.2538 | 55.0 | 2530 | 2.0966 | 0.5161 |
0.3868 | 56.0 | 2576 | 1.5614 | 0.6344 |
0.4662 | 57.0 | 2622 | 1.8929 | 0.5269 |
0.4277 | 58.0 | 2668 | 1.9573 | 0.5376 |
0.5301 | 59.0 | 2714 | 1.7999 | 0.5699 |
0.3867 | 60.0 | 2760 | 2.3481 | 0.4624 |
0.2334 | 61.0 | 2806 | 1.9924 | 0.5376 |
0.2921 | 62.0 | 2852 | 2.0454 | 0.5591 |
0.4386 | 63.0 | 2898 | 1.7798 | 0.5376 |
0.3299 | 64.0 | 2944 | 1.9370 | 0.5914 |
0.5982 | 65.0 | 2990 | 2.0527 | 0.5591 |
0.4433 | 66.0 | 3036 | 1.6222 | 0.6237 |
0.3717 | 67.0 | 3082 | 1.7977 | 0.5914 |
0.3642 | 68.0 | 3128 | 1.6988 | 0.5914 |
0.4541 | 69.0 | 3174 | 1.7567 | 0.6022 |
0.3464 | 70.0 | 3220 | 1.9029 | 0.5699 |
0.2764 | 71.0 | 3266 | 1.9611 | 0.6022 |
0.2138 | 72.0 | 3312 | 1.9333 | 0.5591 |
0.3928 | 73.0 | 3358 | 1.7701 | 0.5806 |
0.1811 | 74.0 | 3404 | 1.8330 | 0.5806 |
0.2076 | 75.0 | 3450 | 1.6676 | 0.6559 |
0.3326 | 76.0 | 3496 | 2.0036 | 0.6022 |
0.1343 | 77.0 | 3542 | 1.6937 | 0.6344 |
0.3031 | 78.0 | 3588 | 1.9223 | 0.6237 |
0.2743 | 79.0 | 3634 | 2.1681 | 0.5699 |
0.3392 | 80.0 | 3680 | 2.0505 | 0.6129 |
0.1346 | 81.0 | 3726 | 2.0190 | 0.5699 |
0.0652 | 82.0 | 3772 | 2.2910 | 0.5699 |
0.4219 | 83.0 | 3818 | 1.8858 | 0.5914 |
0.1386 | 84.0 | 3864 | 1.7976 | 0.6237 |
0.2155 | 85.0 | 3910 | 2.4278 | 0.5161 |
0.4901 | 86.0 | 3956 | 1.9239 | 0.6237 |
0.3141 | 87.0 | 4002 | 2.0954 | 0.6559 |
0.2328 | 88.0 | 4048 | 2.2602 | 0.5806 |
0.2768 | 89.0 | 4094 | 2.1083 | 0.5914 |
0.3476 | 90.0 | 4140 | 2.4922 | 0.5269 |
0.2029 | 91.0 | 4186 | 2.2094 | 0.5591 |
0.2421 | 92.0 | 4232 | 2.2407 | 0.5376 |
0.2034 | 93.0 | 4278 | 2.1488 | 0.5591 |
0.2461 | 94.0 | 4324 | 2.1332 | 0.5806 |
0.1462 | 95.0 | 4370 | 2.2702 | 0.5591 |
0.5213 | 96.0 | 4416 | 2.2134 | 0.5699 |
0.3634 | 97.0 | 4462 | 2.1066 | 0.5699 |
0.1698 | 98.0 | 4508 | 2.2736 | 0.6237 |
0.1685 | 99.0 | 4554 | 2.3919 | 0.5806 |
0.1971 | 100.0 | 4600 | 2.0664 | 0.6237 |
0.1496 | 101.0 | 4646 | 2.5661 | 0.5806 |
0.283 | 102.0 | 4692 | 2.0714 | 0.5699 |
0.185 | 103.0 | 4738 | 2.1369 | 0.6022 |
0.1489 | 104.0 | 4784 | 2.1653 | 0.6129 |
0.1231 | 105.0 | 4830 | 2.0890 | 0.6452 |
0.3224 | 106.0 | 4876 | 2.3771 | 0.5376 |
0.3452 | 107.0 | 4922 | 2.2537 | 0.6344 |
0.4404 | 108.0 | 4968 | 2.0253 | 0.6129 |
0.3408 | 109.0 | 5014 | 2.1653 | 0.5699 |
0.2406 | 110.0 | 5060 | 2.0196 | 0.6237 |
0.3051 | 111.0 | 5106 | 2.1980 | 0.6129 |
0.1515 | 112.0 | 5152 | 2.4104 | 0.5699 |
0.3836 | 113.0 | 5198 | 2.2342 | 0.6344 |
0.3572 | 114.0 | 5244 | 2.2321 | 0.6022 |
0.3006 | 115.0 | 5290 | 2.3555 | 0.5806 |
0.0965 | 116.0 | 5336 | 2.7237 | 0.4516 |
0.2023 | 117.0 | 5382 | 2.3798 | 0.6237 |
0.1272 | 118.0 | 5428 | 2.5357 | 0.5591 |
0.4318 | 119.0 | 5474 | 2.4913 | 0.5699 |
0.0414 | 120.0 | 5520 | 2.3760 | 0.6022 |
0.1785 | 121.0 | 5566 | 2.3920 | 0.6129 |
0.0142 | 122.0 | 5612 | 2.4256 | 0.6022 |
0.1262 | 123.0 | 5658 | 2.7212 | 0.5806 |
0.2219 | 124.0 | 5704 | 2.3683 | 0.5699 |
0.1629 | 125.0 | 5750 | 2.4280 | 0.5484 |
0.149 | 126.0 | 5796 | 3.0708 | 0.4839 |
0.2394 | 127.0 | 5842 | 2.2192 | 0.6022 |
0.2165 | 128.0 | 5888 | 2.4015 | 0.5806 |
0.0729 | 129.0 | 5934 | 2.2241 | 0.6022 |
0.2585 | 130.0 | 5980 | 2.9483 | 0.5054 |
0.1401 | 131.0 | 6026 | 2.3180 | 0.6129 |
0.4162 | 132.0 | 6072 | 3.0147 | 0.4946 |
0.1188 | 133.0 | 6118 | 2.3128 | 0.6237 |
0.0939 | 134.0 | 6164 | 2.5300 | 0.6022 |
0.1039 | 135.0 | 6210 | 2.5740 | 0.5699 |
0.3678 | 136.0 | 6256 | 2.5887 | 0.5914 |
0.3998 | 137.0 | 6302 | 2.5664 | 0.5376 |
0.1952 | 138.0 | 6348 | 2.1861 | 0.6774 |
0.2616 | 139.0 | 6394 | 2.7036 | 0.5806 |
0.2523 | 140.0 | 6440 | 2.5953 | 0.5806 |
0.2772 | 141.0 | 6486 | 2.4114 | 0.6129 |
0.2399 | 142.0 | 6532 | 2.3203 | 0.6237 |
0.3769 | 143.0 | 6578 | 2.7200 | 0.5591 |
0.0094 | 144.0 | 6624 | 2.7315 | 0.5591 |
0.1818 | 145.0 | 6670 | 2.5223 | 0.6129 |
0.3063 | 146.0 | 6716 | 2.3310 | 0.6237 |
0.222 | 147.0 | 6762 | 2.6180 | 0.5806 |
0.2505 | 148.0 | 6808 | 2.2976 | 0.6344 |
0.2705 | 149.0 | 6854 | 2.4091 | 0.5914 |
0.1624 | 150.0 | 6900 | 2.8030 | 0.5269 |
0.1322 | 151.0 | 6946 | 2.6379 | 0.5591 |
0.0876 | 152.0 | 6992 | 2.5781 | 0.5484 |
0.1332 | 153.0 | 7038 | 2.8476 | 0.5591 |
0.2727 | 154.0 | 7084 | 2.6779 | 0.5699 |
0.195 | 155.0 | 7130 | 3.0504 | 0.4839 |
0.152 | 156.0 | 7176 | 2.6103 | 0.5806 |
0.2811 | 157.0 | 7222 | 2.5947 | 0.6129 |
0.0742 | 158.0 | 7268 | 2.4666 | 0.6559 |
0.2052 | 159.0 | 7314 | 2.5116 | 0.5484 |
0.2598 | 160.0 | 7360 | 3.0400 | 0.5269 |
0.2846 | 161.0 | 7406 | 2.2042 | 0.6667 |
0.2653 | 162.0 | 7452 | 3.0598 | 0.5484 |
0.358 | 163.0 | 7498 | 2.7669 | 0.5806 |
0.0355 | 164.0 | 7544 | 2.4568 | 0.6237 |
0.1817 | 165.0 | 7590 | 2.9532 | 0.5806 |
0.0955 | 166.0 | 7636 | 2.4798 | 0.6237 |
0.1941 | 167.0 | 7682 | 2.7027 | 0.5699 |
0.1787 | 168.0 | 7728 | 2.4225 | 0.6237 |
0.0998 | 169.0 | 7774 | 2.5104 | 0.5914 |
0.0392 | 170.0 | 7820 | 2.6235 | 0.5806 |
0.2689 | 171.0 | 7866 | 2.9215 | 0.5806 |
0.0595 | 172.0 | 7912 | 2.8108 | 0.5699 |
0.148 | 173.0 | 7958 | 2.9213 | 0.5806 |
0.2159 | 174.0 | 8004 | 2.6172 | 0.6129 |
0.1221 | 175.0 | 8050 | 2.4386 | 0.6237 |
0.0691 | 176.0 | 8096 | 2.8642 | 0.5269 |
0.2014 | 177.0 | 8142 | 2.7364 | 0.6022 |
0.0379 | 178.0 | 8188 | 2.4859 | 0.6022 |
0.2202 | 179.0 | 8234 | 3.0665 | 0.5484 |
0.2078 | 180.0 | 8280 | 2.3521 | 0.6237 |
0.1051 | 181.0 | 8326 | 2.4827 | 0.6237 |
0.2257 | 182.0 | 8372 | 2.8155 | 0.5914 |
0.1339 | 183.0 | 8418 | 2.6274 | 0.6237 |
0.1414 | 184.0 | 8464 | 2.6779 | 0.5699 |
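
Training loss falls toward zero over the run while validation loss rises, so it can help to select a checkpoint from the logged metrics rather than taking the last one. The sketch below parses rows in the table's pipe-separated layout and reports the epoch with the highest accuracy and the epoch with the lowest validation loss. Only five rows are pasted in for brevity, so the result reflects that subset; the helper name `parse_rows` and the column keys are illustrative.

```python
from typing import Dict, List

# A few rows copied from the table above; paste the full table to analyse every epoch.
ROWS = """
2.1142 | 1.0 | 46 | 2.0131 | 0.3441 |
0.818 | 26.0 | 1196 | 1.1640 | 0.6344 |
0.1952 | 138.0 | 6348 | 2.1861 | 0.6774 |
0.2846 | 161.0 | 7406 | 2.2042 | 0.6667 |
0.1414 | 184.0 | 8464 | 2.6779 | 0.5699 |
"""

COLUMNS = ["train_loss", "epoch", "step", "val_loss", "accuracy"]


def parse_rows(text: str) -> List[Dict[str, float]]:
    """Parse pipe-separated metric rows into one dict per epoch."""
    records = []
    for line in text.strip().splitlines():
        # Drop the empty cell left by the trailing pipe.
        cells = [cell.strip() for cell in line.split("|") if cell.strip()]
        records.append(dict(zip(COLUMNS, map(float, cells))))
    return records


records = parse_rows(ROWS)
best_acc = max(records, key=lambda r: r["accuracy"])
best_loss = min(records, key=lambda r: r["val_loss"])

print("highest accuracy at epoch", int(best_acc["epoch"]))
print("lowest validation loss at epoch", int(best_loss["epoch"]))
```

For these five rows the highest accuracy is at epoch 138 and the lowest validation loss is at epoch 26. The two criteria disagree, which is worth keeping in mind when deciding which checkpoint to keep.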