# Register-enhanced ViT
Dinov2 With Registers Base Imagenet1k 1 Layer
Apache-2.0
A Vision Transformer (ViT) trained with the DINOv2 method that adds register tokens to remove the attention-map artifacts seen in standard ViT models.
Image Classification
Transformers

facebook
693
2
Dinov2 With Registers Small Imagenet1k 1 Layer
Apache-2.0
A Vision Transformer (ViT) trained with DINOv2 and extended with register tokens, which clean up the attention maps, eliminate artifacts, and improve performance.
Image Classification
Transformers

facebook
445
2
Vit Small Patch14 Reg4 Dinov2.lvd142m
Apache-2.0
A Vision Transformer (ViT) image feature model with registers, pre-trained using the self-supervised DINOv2 method on the LVD-142M dataset.
Image Classification
Transformers

timm
15.98k
5
Vit Large Patch14 Reg4 Dinov2.lvd142m
Apache-2.0
A Vision Transformer (ViT) image feature model with registers, pre-trained using the self-supervised DINOv2 method on the LVD-142M dataset.
Image Classification
Transformers

timm
119.48k
7
Vit Giant Patch14 Reg4 Dinov2.lvd142m
Apache-2.0
A Vision Transformer (ViT) image feature model with registers, pre-trained using the self-supervised DINOv2 method on the LVD-142M dataset.
Image Classification
Transformers

timm
917
1
Vit Base Patch14 Reg4 Dinov2.lvd142m
Apache-2.0
A Vision Transformer (ViT) image feature model with registers, pre-trained using the self-supervised DINOv2 method on the LVD-142M dataset.
Image Classification
Transformers

timm
40.95k
10
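
The timm entries above encode their configuration in the model name: `Patch14` is the patch size in pixels and `Reg4` is the number of register tokens. As a rough illustration of what that implies for sequence length, the sketch below parses those two fields and counts the tokens the transformer would see. The helper names `parse_vit_name` and `token_count` are hypothetical, the single class token is an assumption based on the standard ViT layout, and the image sizes are illustrative inputs rather than values stated in the listing.

```python
import re


def parse_vit_name(name: str) -> dict:
    """Pull the patch size and register count out of a ViT model name.

    Hypothetical helper: it only looks for 'patch<N>' and 'reg<N>'
    substrings and does not query timm or any model registry.
    """
    patch = re.search(r"patch(\d+)", name, re.IGNORECASE)
    reg = re.search(r"reg(\d+)", name, re.IGNORECASE)
    return {
        "patch_size": int(patch.group(1)) if patch else None,
        "num_registers": int(reg.group(1)) if reg else 0,
    }


def token_count(image_size: int, patch_size: int, num_registers: int,
                class_tokens: int = 1) -> int:
    """Count tokens for a square image: patches + class token(s) + registers.

    Assumes the image side is an exact multiple of the patch size and
    that the model uses one class token (an assumption, not from the listing).
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    patches_per_side = image_size // patch_size
    return patches_per_side ** 2 + class_tokens + num_registers


if __name__ == "__main__":
    cfg = parse_vit_name("Vit Small Patch14 Reg4 Dinov2.lvd142m")
    # A 224x224 input gives 16x16 = 256 patches, plus 1 class token and 4 registers.
    print(cfg, token_count(224, cfg["patch_size"], cfg["num_registers"]))
```

The register tokens add only a handful of positions to the sequence, so their cost is small next to the patch tokens.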
Featured Recommended AI Models
Qwen2.5 VL 7B Abliterated Caption It I1 GGUF
Apache-2.0
Quantized version of Qwen2.5-VL-7B-Abliterated-Caption-it, supporting multilingual image description tasks.
Image-to-Text
Transformers Supports Multiple Languages

mradermacher
167
1
Nunchaku Flux.1 Dev Colossus
Other
The Nunchaku quantized version of the Colossus Project Flux, designed to generate high-quality images based on text prompts. This model minimizes performance loss while optimizing inference efficiency.
Image Generation English
nunchaku-tech
235
3
Qwen2.5 VL 7B Abliterated Caption It GGUF
Apache-2.0
A static quantized version of a Qwen2.5-VL-7B-based model, focused on image captioning and supporting multiple languages.
Image-to-Text
Transformers Supports Multiple Languages

mradermacher
133
1
Olmocr 7B 0725 FP8
Apache-2.0
olmOCR-7B-0725-FP8 is a document OCR model based on Qwen2.5-VL-7B-Instruct. It is fine-tuned on the olmOCR-mix-0225 dataset and then quantized to FP8.
Image-to-Text
Transformers English

allenai
881
3
Lucy 128k GGUF
Apache-2.0
Lucy-128k is a model built on Qwen3-1.7B that focuses on agentic web search and lightweight browsing, and can run efficiently on mobile devices.
Large Language Model
Transformers English

Mungert
263
2