Vit Base Patch32 224 In21k Finetuned Eurosat
Developed by sshreshtha
An image classification model based on Google's Vision Transformer (ViT) architecture, fine-tuned on the food101 dataset to classify food images
Downloads 30
Release Time: 11/24/2022
Model Overview
This model starts from a pre-trained Vision Transformer checkpoint and is fine-tuned on the food101 food classification dataset, so it can assign an image to one of 101 food categories. Despite the "Eurosat" suffix in its name, the dataset described here is food101.
Model Features
Based on Vision Transformer Architecture
Uses a Transformer encoder for visual feature extraction; as the model name indicates, it is the Base-size ViT with 32x32 pixel patches and 224x224 input resolution, pre-trained on ImageNet-21k
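To make the "patch32 224" part of the name concrete, here is a minimal sketch of how many tokens a ViT of this shape feeds to its encoder. The function name is hypothetical and only illustrates the arithmetic; it is not part of any library.

```python
def vit_sequence_length(image_size: int = 224, patch_size: int = 32) -> int:
    """Number of encoder tokens for a square image: one per patch plus one [CLS] token."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size  # 224 // 32 = 7
    num_patches = patches_per_side ** 2          # 7 * 7 = 49
    return num_patches + 1                       # +1 for the [CLS] token


if __name__ == "__main__":
    print(vit_sequence_length())         # 50
    print(vit_sequence_length(224, 16))  # 197, for comparison with a patch16 model
```

The larger patch size gives a much shorter sequence than a patch16 model, which makes inference cheaper but gives the encoder a coarser view of the image.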
Food Image Classification
Fine-tuned specifically to classify images into the 101 food categories of food101
High Accuracy
Achieves 73.21% classification accuracy on the food101 test set
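For readers who want to check a figure like this on their own data, the sketch below shows how top-1 accuracy is computed from predicted and true labels. The labels in the example are made up for illustration and are not drawn from the model's actual evaluation.

```python
from typing import Sequence


def top1_accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one sample is required")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


if __name__ == "__main__":
    # Hypothetical labels, for illustration only.
    predicted = ["pizza", "sushi", "ramen", "pizza"]
    actual = ["pizza", "sushi", "pho", "pizza"]
    print(f"{top1_accuracy(predicted, actual):.2%}")  # 75.00%
```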
Model Capabilities
Food Image Classification
Visual Feature Extraction
Multi-category Image Recognition
Use Cases
Food Recognition
Restaurant Dish Recognition
Used in restaurants to automatically identify dish images for intelligent menu management
Recognizes the 101 food categories covered by food101
Healthy Diet Applications
Integrated into mobile apps to help users identify food and track dietary intake
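The use cases above could be prototyped roughly as follows with the Hugging Face transformers image-classification pipeline. This is a sketch under assumptions: the repository id is inferred from the model name and developer shown on this page and has not been verified, and the helper names, the image path, and the 0.5 confidence threshold are hypothetical.

```python
from typing import Optional

# Assumption: repository id inferred from the page title and developer name.
MODEL_ID = "sshreshtha/vit-base-patch32-224-in21k-finetuned-eurosat"


def classify_food(image_path: str, top_k: int = 3) -> list:
    """Return the top_k predictions as a list of {"label", "score"} dicts."""
    # Imported lazily so the helper below works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=MODEL_ID)
    return classifier(image_path, top_k=top_k)


def best_label(predictions: list, min_score: float = 0.5) -> Optional[str]:
    """Pick the highest-scoring label, or None if the model is not confident enough."""
    if not predictions:
        return None
    top = max(predictions, key=lambda p: p["score"])
    return top["label"] if top["score"] >= min_score else None


if __name__ == "__main__":
    # "lunch.jpg" is a placeholder path; running this downloads the model weights.
    print(best_label(classify_food("lunch.jpg")))
```

Returning None below a threshold is one way to handle the roughly one in four images the model is expected to misclassify: a diet-tracking app can ask the user to confirm or pick a label instead of logging a low-confidence guess.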