Hindi Image Captioning

Developed by team-indain-image-caption
This is an encoder-decoder image captioning model built on a ViT encoder and a GPT2-Hindi decoder, designed to generate Hindi descriptions of images.
Downloads 35
Release Time: 3/2/2022

Model Overview

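The model combines a vision encoder (ViT) with a language decoder (GPT2-Hindi): the encoder turns an input image into a sequence of patch features, and the decoder attends to those features to generate a Hindi caption one token at a time. It is presented as a first attempt at using the ViT + GPT2-Hindi combination for image captioning.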
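To make the encoder-decoder wiring concrete, here is a minimal sketch using the Hugging Face `transformers` classes `ViTConfig`, `GPT2Config`, and `VisionEncoderDecoderModel`. It does not load the published weights: the model is deliberately tiny and randomly initialised, and every size, token id, and helper name (`build_tiny_captioner`, `greedy_caption_ids`) is an assumption chosen for illustration, not taken from the actual checkpoint. The generated ids are therefore meaningless; only the data flow is the point.

```python
import torch
from transformers import (
    GPT2Config,
    ViTConfig,
    VisionEncoderDecoderConfig,
    VisionEncoderDecoderModel,
)


def build_tiny_captioner() -> VisionEncoderDecoderModel:
    """Build a toy ViT + GPT-2 captioner with random weights (illustrative sizes)."""
    encoder_cfg = ViTConfig(
        image_size=32,
        patch_size=16,
        hidden_size=32,
        num_hidden_layers=1,
        num_attention_heads=2,
        intermediate_size=64,
    )
    decoder_cfg = GPT2Config(
        vocab_size=100,
        n_positions=32,
        n_embd=32,
        n_layer=1,
        n_head=2,
        bos_token_id=1,
        eos_token_id=2,
    )
    # This helper marks the decoder as a decoder and enables cross-attention,
    # which is what lets GPT-2 attend to the ViT patch features.
    config = VisionEncoderDecoderConfig.from_encoder_decoder_configs(
        encoder_cfg, decoder_cfg
    )
    model = VisionEncoderDecoderModel(config=config)
    model.config.decoder_start_token_id = 1  # assumed start token id
    model.config.pad_token_id = 0            # assumed padding token id
    return model.eval()


@torch.no_grad()
def greedy_caption_ids(model, pixel_values, start_id=1, eos_id=2, max_len=8):
    """Greedy decoding: repeatedly pick the most likely next token.

    For clarity the image is re-encoded at every step; a real pipeline
    would cache the encoder output instead.
    """
    ids = torch.full((pixel_values.shape[0], 1), start_id, dtype=torch.long)
    for _ in range(max_len):
        logits = model(pixel_values=pixel_values, decoder_input_ids=ids).logits
        next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        ids = torch.cat([ids, next_id], dim=1)
        if (next_id == eos_id).all():
            break
    return ids


if __name__ == "__main__":
    captioner = build_tiny_captioner()
    image_batch = torch.rand(1, 3, 32, 32)  # one random "image"
    print(greedy_caption_ids(captioner, image_batch))
```

With a trained checkpoint, the same flow would be followed by decoding the token ids back to Hindi text with the decoder's tokenizer.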

Model Features

Hindi Image Captioning: Image captioning capability specifically optimized for Hindi
ViT+GPT2 Combination: First attempt at combining a ViT vision encoder with a GPT2-Hindi language decoder
Community-driven Development: Collaboratively completed by community members during HuggingFace Community Course Week

Model Capabilities

Image Understanding
Hindi Text Generation
Image-to-Text Conversion

Use Cases

Assistive Technology
Visual Assistance: Providing Hindi image descriptions for visually impaired individuals
Content Generation
Social Media Content: Automatically generating Hindi descriptions for social media images