# Register-enhanced ViT

Dinov2 With Registers Base Imagenet1k 1 Layer

A Vision Transformer (ViT) model trained with the DINOv2 method. It introduces register tokens to address the artifact problem of standard ViT models.

Image Classification

Dinov2 With Registers Small Imagenet1k 1 Layer

A Vision Transformer (ViT) model trained with DINOv2 and extended with register tokens, which improve the attention mechanism, eliminate artifacts, and boost performance.

Image Classification

Vit Small Patch14 Reg4 Dinov2.lvd142m

A Vision Transformer (ViT) image feature model with registers, pre-trained using the self-supervised DINOv2 method on the LVD-142M dataset.

Image Classification

Vit Large Patch14 Reg4 Dinov2.lvd142m

A Vision Transformer (ViT) image feature model with registers, pre-trained using the self-supervised DINOv2 method on the LVD-142M dataset.

Image Classification

Vit Giant Patch14 Reg4 Dinov2.lvd142m

A Vision Transformer (ViT) image feature model with registers, pre-trained using the self-supervised DINOv2 method on the LVD-142M dataset.

Image Classification

Vit Base Patch14 Reg4 Dinov2.lvd142m

A Vision Transformer (ViT) image feature model with registers, pre-trained using the self-supervised DINOv2 method on the LVD-142M dataset.

Image Classification
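
The model names listed here encode the patch size and the number of register tokens ("Patch14 Reg4" means 14x14 pixel patches and 4 registers). As a rough sketch of what that implies for the token sequence the model processes, the snippet below parses a name and counts tokens. The helper names `parse_vit_name` and `sequence_length` are hypothetical, and the 224-pixel input resolution and single class token are assumptions made for illustration, not values stated on this page.

```python
import re


def parse_vit_name(name: str) -> dict:
    """Extract patch size and register-token count from a model name.

    Hypothetical helper: relies only on the 'patchN' / 'regM' pattern
    visible in the names listed above.
    """
    normalized = name.lower().replace(" ", "_")
    match = re.search(r"patch(\d+)(?:_reg(\d+))?", normalized)
    if match is None:
        raise ValueError(f"no patch size found in {name!r}")
    return {
        "patch_size": int(match.group(1)),
        # Names without a 'regM' part are treated as having no registers.
        "num_registers": int(match.group(2) or 0),
    }


def sequence_length(image_size: int, patch_size: int,
                    num_registers: int, cls_tokens: int = 1) -> int:
    """Count tokens: image patches + class token(s) + register tokens."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    patches_per_side = image_size // patch_size
    return patches_per_side ** 2 + cls_tokens + num_registers


info = parse_vit_name("Vit Small Patch14 Reg4 Dinov2.lvd142m")
# Assumed 224x224 input: 16 * 16 = 256 patches, plus 1 class token
# and 4 register tokens.
print(info, sequence_length(224, **info))
```

Register tokens only lengthen the sequence by a small constant, so under these assumptions the four registers add 4 tokens to the 257 a comparable model without registers would use.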
