Cogflorence 2.1 Large

Developed by thwri
This model is a fine-tuned version of microsoft/Florence-2-large for image-to-text tasks. It was trained on a 40,000-image subset of the Ejafa/ye-pop dataset, using annotations generated by THUDM/cogvlm2-llama3-chat-19B.
Downloads 2,541
Release Time: 7/27/2024

Model Overview

This model is primarily used for image-to-text tasks and can generate detailed image descriptions. It was fine-tuned on a 40,000-image subset of the Ejafa/ye-pop dataset to strengthen its annotation (captioning) capabilities.
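A minimal sketch of how a description might be generated with the Hugging Face transformers library. Several details are assumptions not stated on this page: the repository id in MODEL_ID, the "<MORE_DETAILED_CAPTION>" task token (a standard Florence-2 prompt), the generation settings, and the availability of the transformers, torch, and Pillow packages. The helper names load_captioner and describe_image are hypothetical.

```python
# Sketch only: MODEL_ID and TASK_PROMPT are assumptions, not taken from this page.
MODEL_ID = "thwri/CogFlorence-2.1-Large"  # assumed Hugging Face repository id
TASK_PROMPT = "<MORE_DETAILED_CAPTION>"   # Florence-2 task token for long captions


def load_captioner(model_id=MODEL_ID, device="cpu"):
    """Load the model and its processor (weights are downloaded on first use)."""
    # Imported lazily so this module can be imported without the heavy packages.
    from transformers import AutoModelForCausalLM, AutoProcessor

    # Florence-2 checkpoints ship custom modeling code, hence trust_remote_code.
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
    return model.to(device).eval(), processor


def describe_image(image_path, model, processor, device="cpu", max_new_tokens=512):
    """Return a detailed text description for the image at image_path."""
    import torch
    from PIL import Image

    image = Image.open(image_path).convert("RGB")
    inputs = processor(text=TASK_PROMPT, images=image, return_tensors="pt").to(device)

    with torch.no_grad():
        generated_ids = model.generate(
            input_ids=inputs["input_ids"],
            pixel_values=inputs["pixel_values"],
            max_new_tokens=max_new_tokens,  # illustrative value
            num_beams=3,                    # illustrative value
        )

    # Keep special tokens: the Florence-2 post-processor uses them for parsing.
    raw_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    parsed = processor.post_process_generation(
        raw_text, task=TASK_PROMPT, image_size=(image.width, image.height)
    )
    return parsed[TASK_PROMPT]
```

Typical use would be to call load_captioner() once and then describe_image("photo.jpg", model, processor) per image; "photo.jpg" is a placeholder path.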

Model Features

High-Quality Image Annotation
Capable of generating detailed and accurate image descriptions, suitable for images of various themes.
Large-Scale Dataset Training
Fine-tuned on a subset of 40,000 images from the Ejafa/ye-pop dataset, improving the model's generalization ability.
Frozen Visual Encoder
The visual encoder was frozen during training, preserving the original model's visual feature extraction capabilities.
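Freezing a sub-module during fine-tuning usually means disabling gradients for its parameters. The sketch below illustrates that general technique and is not the author's training code. The "vision_tower." prefix is an assumption about how the Florence-2 implementation names its visual encoder parameters, and freeze_by_prefix and trainable_names are hypothetical helpers.

```python
def freeze_by_prefix(model, prefix="vision_tower."):
    """Disable gradients for every parameter whose name starts with `prefix`.

    Works with any object exposing `named_parameters()`, such as a
    torch.nn.Module. The default prefix is an assumed name for the
    visual encoder. Returns the names of the frozen parameters.
    """
    frozen = []
    for name, param in model.named_parameters():
        if name.startswith(prefix):
            param.requires_grad = False  # optimizer updates will skip this tensor
            frozen.append(name)
    return frozen


def trainable_names(model):
    """List the parameter names that still receive gradients."""
    return [name for name, param in model.named_parameters() if param.requires_grad]
```

Calling trainable_names(model) after freezing is a quick way to confirm that only the non-visual parameters remain trainable before building the optimizer.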

Model Capabilities

Image Description Generation
Multi-theme Image Analysis
High-Quality Text Output

Use Cases

Image Annotation
Detailed Image Description
Generates detailed textual descriptions for images, suitable for content management and retrieval.
Produces descriptive text including details such as colors, shapes, backgrounds, etc.
Content Management
Automated Image Tagging
Automatically generates tags for large volumes of images, improving content management efficiency.
Quickly generates accurate image tags, reducing manual annotation workload.
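For the content management use case, descriptions can be generated in bulk and stored in a searchable index. The sketch below assumes a folder of image files and a caller-supplied caption_fn (any callable that maps an image path to a description string, for example a wrapper around the model). The function name, file extensions, and JSON Lines output format are illustrative choices, not part of the model.

```python
import json
from pathlib import Path

# Illustrative set of file extensions treated as images.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def caption_folder(folder, caption_fn, out_path):
    """Describe every image in `folder` and write one JSON record per line.

    `caption_fn` is any callable mapping an image path to a description.
    Returns the number of records written.
    """
    paths = sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    with open(out_path, "w", encoding="utf-8") as out:
        for path in paths:
            record = {"file": path.name, "description": caption_fn(path)}
            out.write(json.dumps(record, ensure_ascii=False) + "\n")
    return len(paths)
```

Passing the captioning step in as a function keeps the batching logic independent of how the model is loaded, so it can be checked with a stub before running on real images.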