Model Selection

# Image semantic encoding

Resnet50 Clip Gap.cc12m

A CLIP-style image encoder built on the ResNet50 architecture and trained on the CC12M dataset. It extracts image features through Global Average Pooling (GAP).

Image Classification
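To make the Global Average Pooling step concrete, here is a minimal sketch in plain Python. It is not the model's actual implementation: the function name `global_average_pool` and the tiny 2-channel feature map are made up for illustration. It only shows how a channels x height x width feature map is reduced to one value per channel.

```python
def global_average_pool(feature_map):
    """Average each channel of a C x H x W feature map over all spatial positions.

    `feature_map` is a nested list: channels -> rows -> values.
    Returns a flat list with one averaged value per channel.
    """
    pooled = []
    for channel in feature_map:
        # Flatten the H x W grid of this channel into a single list of values.
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


# Hypothetical 2-channel, 2x2 feature map (illustrative values only).
feature_map = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [4.0, 8.0]],
]

embedding = global_average_pool(feature_map)
print(embedding)  # [2.5, 3.0]
```

The output length equals the number of channels regardless of the spatial size, which is why GAP yields a fixed-size feature vector.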

Vit Large Patch16 224.mae

An image feature extraction model based on the Large variant of the Vision Transformer (ViT), pre-trained on the ImageNet-1k dataset with the self-supervised Masked Autoencoder (MAE) method.

Image Classification

Vit Base Patch16 224.mae

An image feature extraction model based on the Base variant of the Vision Transformer (ViT), pre-trained on the ImageNet-1k dataset with the self-supervised Masked Autoencoder (MAE) method.

Image Classification
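The "Patch16 224" part of the two ViT model names suggests 16x16 pixel patches on 224x224 pixel inputs. The sketch below works through the resulting patch arithmetic. The helper `patch_grid` is hypothetical, and the 75% mask ratio is an assumption taken from the default commonly reported for MAE pre-training, not something stated on this page.

```python
def patch_grid(image_size=224, patch_size=16):
    """Return (patches per side, total patches) for a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side, per_side * per_side


per_side, num_patches = patch_grid()

# Assumption: 75% of patches are masked during MAE pre-training.
mask_ratio = 0.75
num_masked = int(num_patches * mask_ratio)
num_visible = num_patches - num_masked

print(per_side, num_patches)    # 14 196
print(num_masked, num_visible)  # 147 49
```

Under these assumptions the encoder would see only 49 of the 196 patches during pre-training, while at feature-extraction time it processes all of them.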
