AIbase

# High-precision image understanding

Clip Vitb16 Test Time Registers
A vision-language model based on the OpenCLIP ViT-B-16 architecture. It introduces test-time registers to clean up the model's internal representations, which addresses artifacts in its feature maps.
Text-to-Image Transformers
C
amildravid4292
517
0
Convnext Xxlarge.clip Laion2b Soup
Apache-2.0
A ConvNeXt-XXLarge image encoder trained by LAION within the CLIP framework, suitable for multimodal tasks.
Image Classification Transformers
C
timm
220
0
CLIP Convnext Xxlarge Laion2b S34b B82k Augreg
MIT
A CLIP ConvNeXt-XXLarge model trained on the LAION-2B dataset and implemented with the OpenCLIP framework. It is the first CLIP model with a non-ViT architecture to achieve >79% ImageNet zero-shot accuracy.
Text-to-Image
C
laion
6,616
9
CLIP Convnext Xxlarge Laion2b S34b B82k Augreg Soup
MIT
A CLIP ConvNeXt-XXLarge model trained on the LAION-2B dataset using the OpenCLIP framework. It is the first CLIP model with a non-ViT image tower to achieve >79% ImageNet top-1 zero-shot accuracy.
Text-to-Image
C
laion
9,412
22