Vit Finetuned Food101

Developed by ashaduzzaman
This is a Vision Transformer model fine-tuned on the Food-101 dataset for food image classification tasks.
Downloads 162
Release Time: 8/28/2024

Model Overview

Based on Google's Vision Transformer (ViT) architecture, this model is fine-tuned to classify images into the 101 categories of the Food-101 dataset, which makes it suitable for scenarios such as diet tracking and restaurant menu analysis.
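
A minimal usage sketch, assuming the checkpoint is published on the Hugging Face Hub under the id ashaduzzaman/vit-finetuned-food101 (inferred from the developer and model name on this page, not confirmed by it) and that the transformers and Pillow packages are installed. The helper names classify_food and format_predictions are hypothetical.

```python
# Assumed Hub id, inferred from this page; verify it before use.
MODEL_ID = "ashaduzzaman/vit-finetuned-food101"


def classify_food(image_path, top_k=3):
    """Return the top_k predictions for one image as {"label", "score"} dicts."""
    # Imported lazily so format_predictions works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=MODEL_ID)
    return classifier(image_path, top_k=top_k)


def format_predictions(predictions):
    """Turn pipeline-style output into readable lines, highest score first."""
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return [f"{p['label']}: {p['score']:.1%}" for p in ranked]
```

Calling format_predictions(classify_food("lunch.jpg")) would then yield one line per candidate label; "lunch.jpg" is a placeholder path. Building the pipeline once and reusing it is preferable when classifying many images.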

Model Features

High-Accuracy Food Classification
Achieves 89.6% accuracy on the Food-101 test set, which spans 101 different food categories.
ViT-Based Architecture
Utilizes the Vision Transformer architecture with self-attention mechanisms to capture global image features.
Transfer Learning Optimization
Fine-tuned from a pre-trained ViT model, effectively leveraging features learned from large-scale image data.
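
For reference, top-1 accuracy is the fraction of test images whose highest-scoring label matches the ground truth. The sketch below shows the calculation; top1_accuracy is a hypothetical helper, and the labels are made up for illustration rather than taken from the actual evaluation.

```python
def top1_accuracy(predicted, actual):
    """Fraction of positions where the predicted label equals the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Illustrative labels only, not real model output.
predicted = ["pizza", "sushi", "ramen", "pizza", "tacos"]
actual = ["pizza", "sushi", "pho", "pizza", "tacos"]
print(f"{top1_accuracy(predicted, actual):.1%}")  # prints 80.0%
```

The standard Food-101 split holds 250 test images per class, 25,250 in total. If the reported figure was measured on that split, 89.6% corresponds to roughly 22,600 correct predictions.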

Model Capabilities

Food Image Classification
Multi-category Recognition
Diet Analysis
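
Multi-category recognition usually means ranking all classes by probability instead of returning a single label. A minimal sketch of that post-processing step, assuming the model emits one raw score (logit) per class; softmax and top_k are hypothetical helpers and the logits are invented for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities; subtracting the max avoids overflow."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, labels, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Illustrative logits for four classes; the real model emits 101.
labels = ["apple_pie", "pizza", "sushi", "tiramisu"]
logits = [0.2, 3.1, 1.4, -0.5]
for label, prob in top_k(logits, labels, k=2):
    print(f"{label}: {prob:.3f}")
```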

Use Cases

Diet & Health
Automatic Food Logging
Helps users automatically record dietary content by taking photos
Accurately identifies 101 common food items
Food & Beverage Industry
Menu Analysis
Automatically classifies the food categories shown in restaurant menu photos
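
The logging and menu-analysis use cases above both reduce to storing classifier output and summarizing it by category. A minimal sketch under stated assumptions: predictions arrive as {"label", "score"} dicts, and only results at or above a confidence threshold (0.5 here, an arbitrary choice) are logged automatically. LogEntry, log_meal, and category_counts are hypothetical names, and the sample predictions are invented.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass
class LogEntry:
    timestamp: datetime
    label: str
    score: float


def log_meal(log, predictions, timestamp, min_score=0.5):
    """Append the best prediction to the log if it reaches min_score.

    Returns the new entry, or None when nothing was logged.
    """
    if not predictions:
        return None
    best = max(predictions, key=lambda p: p["score"])
    if best["score"] < min_score:
        return None  # too uncertain to log automatically
    entry = LogEntry(timestamp, best["label"], best["score"])
    log.append(entry)
    return entry


def category_counts(log):
    """Count how often each food category appears in the log."""
    return Counter(entry.label for entry in log)


# Hypothetical classifier outputs for three photos.
log = []
log_meal(log, [{"label": "ramen", "score": 0.91}], datetime(2024, 8, 28, 12, 30))
log_meal(log, [{"label": "ramen", "score": 0.88}], datetime(2024, 8, 29, 13, 0))
log_meal(log, [{"label": "donuts", "score": 0.31}], datetime(2024, 8, 29, 16, 0))
print(category_counts(log))
```

Low-confidence photos are skipped instead of logged with a guess; an application could prompt the user to confirm those instead.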