Vit Base Patch32 224 In21k Finetuned Eurosat

Developed by sshreshtha
An image classification model based on Google's Vision Transformer (ViT) architecture, fine-tuned on the food101 dataset for food image classification.
Downloads 30
Release Time: 11/24/2022

Model Overview

This model is a pre-trained Vision Transformer (ViT) fine-tuned on the food101 food classification dataset. It classifies images into 101 food categories.
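As a rough illustration of how a model like this might be queried, the sketch below uses the `image-classification` pipeline from the Hugging Face `transformers` library. This page does not state the model's Hub identifier, so `model_id` is left as a parameter. The helper names `load_classifier` and `classify_image` are hypothetical, and running the sketch assumes `transformers`, a backend such as PyTorch, and Pillow are installed.

```python
from typing import Any, Dict, List


def load_classifier(model_id: str):
    """Build an image-classification pipeline for the given Hub model ID."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import pipeline

    return pipeline("image-classification", model=model_id)


def classify_image(classifier, image_path: str, top_k: int = 5) -> List[Dict[str, Any]]:
    """Return the top_k predictions as dicts with 'label' and 'score' keys."""
    # The classifier is any callable with the pipeline's call signature.
    return classifier(image_path, top_k=top_k)
```

A caller would pass the result of `load_classifier(...)` to `classify_image` together with a path to a food photo, then read the `label` and `score` fields of each returned entry.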

Model Features

Based on Vision Transformer Architecture
Uses a Transformer encoder over image patches for visual feature extraction; the model name indicates 32x32 patches at a 224x224 input resolution
Food Image Classification
Fine-tuned to classify images into the 101 food categories of food101
High Accuracy
Achieves 73.21% classification accuracy on the food101 test set
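To make the accuracy figure concrete, here is a minimal sketch of how such a percentage is typically computed, assuming the reported number is standard top-1 accuracy (the page does not say which metric was used). The function name `top1_accuracy` and the sample labels are illustrative only.

```python
from typing import Sequence


def top1_accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Return the percentage of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty set")
    # Each matching pair counts as one correct prediction.
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


# Illustrative data: three of four predictions are correct.
predicted = ["pizza", "sushi", "ramen", "pizza"]
actual = ["pizza", "sushi", "pizza", "pizza"]
print(top1_accuracy(predicted, actual))
```

Under this reading, 73.21% means roughly 73 of every 100 test images receive the correct label as the top prediction.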

Model Capabilities

Food Image Classification
Visual Feature Extraction
Multi-category Image Recognition

Use Cases

Food Recognition
Restaurant Dish Recognition
Used in restaurants to automatically identify dish images for intelligent menu management
Recognizes the 101 food categories covered by food101
Healthy Diet Applications
Integrated into mobile apps to help users identify food and track dietary intake
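For a diet-tracking use case like the one above, an app would probably want to log a food only when the classifier is reasonably confident. The sketch below assumes predictions arrive as lists of dicts with `label` and `score` keys, which is the shape returned by the `transformers` image-classification pipeline. The helper names `confident_label` and `tally_foods` and the 0.5 threshold are hypothetical choices, not part of the model.

```python
from collections import Counter
from typing import Dict, List, Optional, Union

Prediction = Dict[str, Union[str, float]]


def confident_label(predictions: List[Prediction], min_score: float = 0.5) -> Optional[str]:
    """Return the highest-scoring label, or None if it falls below min_score."""
    if not predictions:
        return None
    best = max(predictions, key=lambda p: p["score"])
    return best["label"] if best["score"] >= min_score else None


def tally_foods(per_image_predictions: List[List[Prediction]], min_score: float = 0.5) -> Counter:
    """Count confidently recognized foods across a batch of images."""
    labels = (confident_label(preds, min_score) for preds in per_image_predictions)
    # Images without a confident prediction are skipped rather than guessed.
    return Counter(label for label in labels if label is not None)
```

Skipping low-confidence images, rather than logging the best guess, is one possible design choice; an app could instead ask the user to confirm the label.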