
vit_base_r50_s16_224.orig_in21k

Developed by timm
A hybrid image classification model combining ResNet and Vision Transformer, pre-trained on ImageNet-21k, suitable for feature extraction and fine-tuning scenarios.
Downloads 876
Release Time: 12/23/2022

Model Overview

This model is a hybrid image classification model that combines ResNet and Vision Transformer (ViT). It was pre-trained on ImageNet-21k by the paper authors in the JAX framework and later ported to PyTorch. It does not include a classification head and is suitable for feature extraction and fine-tuning.

Model Features

Hybrid Architecture
A ResNet-50 convolutional backbone reduces the 224x224 input to a stride-16 feature map, and each position of that map is passed as a token to a ViT-Base transformer encoder.
Pre-trained Model
Pre-trained on the large-scale ImageNet-21k dataset, offering robust feature extraction capabilities.
Flexible Application
Does not include a classification head, making it suitable for feature extraction and fine-tuning scenarios.

Model Capabilities

Image Classification (after fine-tuning with a classification head)
Image Feature Extraction

Use Cases

Computer Vision
Image Classification
Because the checkpoint ships without a classification head, add a task-specific head and fine-tune the model before using it to recognize the target categories.
Feature Extraction
Extract high-level image features for subsequent tasks such as object detection and image segmentation.