dog-food-vit-base-patch16-224-in21k: Open-source Image Classification Model for Distinguishing Dog and Food Images


Developed by sasha

This is an image classification model based on the Vision Transformer (ViT) architecture, specifically designed to distinguish between images of dogs and food.

Image Classification

Transformers

#High-precision image classification #Pet and food recognition #ViT Vision Transformer

Downloads 32

Release Time: 6/20/2022

Model Overview

The model is trained on the sasha/dog-food dataset of dog and food images and separates the two classes with 99.78% accuracy on the test split. It is suitable for applications that need to classify these two types of images automatically.

Model Features

High accuracy

Achieves 99.78% accuracy on the test split, with 99.67% precision, 100% recall, and an F1 score of 99.83%.

Based on ViT architecture

Uses the Vision Transformer (ViT) Base architecture, starting from the patch16-224-in21k pre-trained checkpoint (16×16-pixel patches, 224×224-pixel input, pre-trained on ImageNet-21k).

Simple and easy to use

Can be easily trained and used via HuggingPics.
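The patch16-224 part of the model name fixes the input geometry. As a small worked sketch, assuming the standard ViT layout of non-overlapping square patches on an RGB image plus one classification token (this page does not spell that out), the number of tokens the model sees per image follows from simple arithmetic. The function name is hypothetical.

```python
def vit_sequence_shape(image_size: int = 224, patch_size: int = 16, channels: int = 3):
    """Return (patches_per_side, num_patches, sequence_length, values_per_patch)."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size          # patches along one edge
    num_patches = per_side * per_side            # patches in the whole image
    seq_len = num_patches + 1                    # assumption: one extra [CLS] token
    values_per_patch = patch_size * patch_size * channels  # raw pixel values per patch
    return per_side, num_patches, seq_len, values_per_patch


print(vit_sequence_shape())  # (14, 196, 197, 768)
```

So a 224×224 image is cut into a 14×14 grid of 196 patches, each flattened to 768 raw values before projection.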

Model Capabilities

Image classification

Distinguishing between dogs and food
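A minimal inference sketch, assuming the model is published on the Hugging Face Hub as sasha/dog-food-vit-base-patch16-224-in21k (inferred from the developer and model name on this page, not confirmed here) and that the transformers, torch, and Pillow packages are installed. The helper names are hypothetical; the label strings returned depend on the model's own configuration.

```python
# Assumed Hub ID, built from the developer name and model name on this page.
MODEL_ID = "sasha/dog-food-vit-base-patch16-224-in21k"


def top_prediction(predictions):
    """Pick the highest-scoring entry from pipeline-style output,
    i.e. a list of {"label": str, "score": float} dicts."""
    return max(predictions, key=lambda p: p["score"])


def classify(image_path: str):
    """Classify one image file; downloads the model on first use."""
    # Imported lazily because transformers is a heavy dependency.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=MODEL_ID)
    return top_prediction(classifier(image_path))
```

Calling classify with the path to a local image would return a single dict holding the predicted label and its score.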

Use Cases

Image classification

Pet and food recognition

Automatically identifies whether an image contains a dog or food

Reported accuracy of 99.78% on the test split

Content filtering

Used to filter or classify content containing dogs or food
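The content-filtering use case can be sketched as a thresholded filter over classifier output. This is an illustration only: the function name, the 0.9 threshold, the "dog" and "food" label strings, and the scores below are all assumptions, and the input mimics the list-of-dicts format that a Transformers image-classification pipeline returns.

```python
def keep_images(results, wanted_label, min_score=0.9):
    """Return paths whose top prediction is `wanted_label` with score >= min_score.

    `results` maps an image path to pipeline-style output:
    a list of {"label": str, "score": float} dicts.
    """
    kept = []
    for path, predictions in results.items():
        best = max(predictions, key=lambda p: p["score"])
        if best["label"] == wanted_label and best["score"] >= min_score:
            kept.append(path)
    return kept


# Made-up scores for illustration only; not real model output.
example = {
    "a.jpg": [{"label": "dog", "score": 0.99}, {"label": "food", "score": 0.01}],
    "b.jpg": [{"label": "food", "score": 0.97}, {"label": "dog", "score": 0.03}],
    "c.jpg": [{"label": "dog", "score": 0.62}, {"label": "food", "score": 0.38}],
}
print(keep_images(example, "dog"))  # ['a.jpg']
```

Raising min_score trades coverage for fewer false matches; low-confidence images such as c.jpg above could instead be routed to manual review.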

Property	Details
Model Name	dog-food-vit-base-patch16-224-in21k
Task	Image Classification
Dataset	Dog Food (sasha/dog-food)
Metrics
Metric	Value	Split
Accuracy	0.9988889098167419	train
Accuracy	0.9977777777777778	test (verified)
Precision	0.9966777408637874	test (verified)
Recall	1.0	test (verified)
AUC	0.9999777777777779	test (verified)
F1	0.9983361064891847	test (verified)
Loss	0.009058385156095028	test (verified)
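As a quick consistency check on the test-split figures above, F1 is the harmonic mean of precision and recall, so the reported F1 can be recomputed from the other two values. The variable names are illustrative; the numbers are copied from the table.

```python
precision = 0.9966777408637874    # test split, from the table
recall = 1.0                      # test split, from the table
reported_f1 = 0.9983361064891847  # test split, from the table

# F1 = 2 * P * R / (P + R)
f1 = 2 * precision * recall / (precision + recall)

print(abs(f1 - reported_f1) < 1e-9)  # True
```

A recall of 1.0 means every positive test image was found, so the small gap between F1 and 1.0 comes entirely from precision.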