ViT-Base-Patch16-224 + DistilGPT-2
DistilViT is an image captioning model that pairs a Vision Transformer (ViT) image encoder with a distilled GPT-2 text decoder to convert images into textual descriptions.
Downloads: 17
Release Date: 6/19/2024
Model Overview
This model combines the image-encoding capability of a Vision Transformer with the text-generation ability of distilled GPT-2. It is designed for image-to-text tasks: given an input image, it generates a descriptive caption. A minimal usage sketch follows.
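As a rough illustration of how such a ViT + DistilGPT-2 encoder-decoder is typically loaded and run with Hugging Face transformers (the model ID below is an assumed placeholder, not confirmed by this page; substitute the actual repository path):

```python
# Minimal captioning sketch with Hugging Face transformers.
# MODEL_ID is an assumed placeholder; replace it with the real repository path.
from PIL import Image
import requests
from transformers import VisionEncoderDecoderModel, ViTImageProcessor, AutoTokenizer

MODEL_ID = "your-namespace/vit-base-patch16-224-distilgpt2"  # assumption

model = VisionEncoderDecoderModel.from_pretrained(MODEL_ID)
processor = ViTImageProcessor.from_pretrained(MODEL_ID)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# Load an example image (any RGB image works).
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

# Encode the image with the ViT processor, then generate a caption
# with the distilled GPT-2 decoder.
pixel_values = processor(images=image, return_tensors="pt").pixel_values
output_ids = model.generate(pixel_values, max_new_tokens=32, num_beams=4)
caption = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(caption)
```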
Model Features
Efficient Image Understanding
Uses a ViT model as the image encoder to effectively capture image content
Lightweight Text Generation
Employs distilled GPT-2 as the text decoder to reduce model size while maintaining performance
Multi-dataset Training
Trained on multiple datasets including Flickr30k and COCO 2017 to enhance generalization capability
Model Capabilities
Image Content Understanding
Image Caption Generation
Vision-Language Conversion
Use Cases
Assistive Technology
Generating Image Descriptions for the Visually Impaired
Automatically generates textual descriptions for images to help visually impaired individuals understand image content
Content Management
Automatic Image Tagging
Automatically generates descriptive tags for large volumes of images to facilitate search and management
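A minimal sketch of how batch tagging could be scripted under the same assumptions as the example above (the model ID, image folder, and output file are illustrative, not part of this page):

```python
# Caption every image in a folder and store the results as searchable tags.
import json
from pathlib import Path
from PIL import Image
from transformers import VisionEncoderDecoderModel, ViTImageProcessor, AutoTokenizer

MODEL_ID = "your-namespace/vit-base-patch16-224-distilgpt2"  # assumption

model = VisionEncoderDecoderModel.from_pretrained(MODEL_ID)
processor = ViTImageProcessor.from_pretrained(MODEL_ID)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

tags = {}
for path in Path("images").glob("*.jpg"):  # hypothetical image folder
    image = Image.open(path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
    output_ids = model.generate(pixel_values, max_new_tokens=32, num_beams=4)
    tags[path.name] = tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Write the filename-to-caption map for downstream search and management.
Path("tags.json").write_text(json.dumps(tags, indent=2))
```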