Swin-AraGPT2 Image Captioning v3

Developed by AsmaMassad

An image captioning model that pairs a Swin Transformer image encoder with an AraGPT2 text decoder to generate Arabic text descriptions of input images.

Image-to-Text

Transformers

#Image Captioning #Multimodal Model #Low-Resource Optimization


Release date: 6/6/2023

Model Overview

This vision-language model combines the image encoding of Swin Transformer with the text generation of AraGPT2 and is designed specifically for image captioning.

Model Features

Multimodal Architecture

Combines a vision Transformer (Swin Transformer) with a language model (AraGPT2) to convert images to text.

End-to-End Training

The entire model is fine-tuned end-to-end, jointly optimizing image understanding and text generation.

Cross-Modal Understanding

Interprets image content and generates coherent descriptive text.
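The encoder-decoder pairing described above can be sketched with the Hugging Face Transformers library. This is a minimal sketch rather than the author's documented usage: the Hub identifier `MODEL_ID` is an assumption inferred from the developer and model names on this page, the decoding settings in `generation_kwargs` are illustrative values, and the code assumes the repository ships image-processor and tokenizer files alongside the weights.

```python
from typing import Any, Dict

# Assumed Hub identifier, inferred from the developer and model names on this
# page; verify it before use.
MODEL_ID = "AsmaMassad/swin-aragpt2-image-captioning-v3"


def generation_kwargs(max_new_tokens: int = 32, num_beams: int = 4) -> Dict[str, Any]:
    """Return illustrative decoding settings (not taken from the model card)."""
    if max_new_tokens < 1 or num_beams < 1:
        raise ValueError("max_new_tokens and num_beams must be positive")
    return {"max_new_tokens": max_new_tokens, "num_beams": num_beams}


def caption_image(image_path: str, model_id: str = MODEL_ID) -> str:
    """Generate one caption for the image stored at image_path."""
    # Heavy dependencies are imported here so the helpers above stay usable
    # even when Pillow, PyTorch, and Transformers are not installed.
    from PIL import Image
    from transformers import (
        AutoImageProcessor,
        AutoTokenizer,
        VisionEncoderDecoderModel,
    )

    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    processor = AutoImageProcessor.from_pretrained(model_id)
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    # The image encoder expects RGB input converted to a tensor of pixel values.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    # The text decoder produces token ids, which the tokenizer turns into text.
    output_ids = model.generate(pixel_values, **generation_kwargs())
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0].strip()
```

Calling `caption_image("photo.jpg")` (a hypothetical local file) downloads the weights on first use and returns a single caption string.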

Model Capabilities

Image Content Understanding

Arabic Text Generation

Image-to-Text Conversion

Use Cases

Assistive Technology

Visual Impairment Assistance

Generates image descriptions for visually impaired users

Content Generation

Social Media Content Auto-Generation

Automatically generates descriptive text for uploaded images
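One way both use cases could surface a generated caption is as the alt text of an image on a web page, where screen readers pick it up. The sketch below is an illustration only: the helper name `img_tag_with_caption` is hypothetical, and the caption passed in would come from the model rather than from this code.

```python
from html import escape


def img_tag_with_caption(src: str, caption: str) -> str:
    """Build an <img> tag whose alt text is the generated caption."""
    # Escape both values so quotes or angle brackets cannot break the markup.
    safe_src = escape(src, quote=True)
    safe_alt = escape(caption.strip(), quote=True)
    # lang="ar" marks the alt text as Arabic for screen readers.
    return '<img src="{}" alt="{}" lang="ar">'.format(safe_src, safe_alt)
```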

Training Results

Training Loss  Epoch  Step   Validation Loss  METEOR  BLEU-1  BLEU-2  BLEU-3  BLEU-4
1.5775         4.71   5000   1.2386           1.91    2.6908  1.0804  0.3964  0.1282
1.2446         9.42   10000  1.1985           5.09    8.4549  2.9556  1.2756  0.4817
1.1919         14.12  15000  1.1792           5.4     9.0722  2.9343  1.1887  0.4748
1.1669         18.83  20000  1.1743           5.02    8.5611  2.9273  1.1796  0.4618
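The table does not say which checkpoint was published, so it can help to check which step scores best on each metric. The sketch below copies the four rows above and picks the best step per column; the helper name `best_step` is illustrative and not part of any library.

```python
# Rows copied from the training results table, in the table's column order.
COLUMNS = (
    "training_loss", "epoch", "step", "validation_loss",
    "meteor", "bleu1", "bleu2", "bleu3", "bleu4",
)
ROWS = [
    (1.5775, 4.71, 5000, 1.2386, 1.91, 2.6908, 1.0804, 0.3964, 0.1282),
    (1.2446, 9.42, 10000, 1.1985, 5.09, 8.4549, 2.9556, 1.2756, 0.4817),
    (1.1919, 14.12, 15000, 1.1792, 5.4, 9.0722, 2.9343, 1.1887, 0.4748),
    (1.1669, 18.83, 20000, 1.1743, 5.02, 8.5611, 2.9273, 1.1796, 0.4618),
]


def best_step(metric: str, higher_is_better: bool = True) -> int:
    """Return the training step with the best value for the given column."""
    metric_idx = COLUMNS.index(metric)
    step_idx = COLUMNS.index("step")
    pick = max if higher_is_better else min
    return pick(ROWS, key=lambda row: row[metric_idx])[step_idx]


if __name__ == "__main__":
    print("lowest validation loss:", best_step("validation_loss", higher_is_better=False))
    for name in ("meteor", "bleu1", "bleu2", "bleu3", "bleu4"):
        print("best", name, "at step", best_step(name))
```

On these rows, validation loss is lowest at step 20000, METEOR and BLEU-1 peak at step 15000, and BLEU-2 through BLEU-4 peak at step 10000, so the lowest-loss checkpoint is not the best one on every captioning metric.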
