ResNet-34 v1.5
A ResNet model pre-trained on ImageNet-1k at a resolution of 224x224, introduced in the paper Deep Residual Learning for Image Recognition by He et al.
Quick Start
ResNet-34 v1.5 is pre-trained on ImageNet-1k and can be used for image classification. You can load the model and feature extractor from the Hugging Face model hub and classify images as shown in the usage example below.
Features
- Residual Learning and Skip Connections: ResNet democratized the concepts of residual learning and skip connections, enabling the training of much deeper models.
- ResNet v1.5 Improvement: In the bottleneck blocks requiring downsampling, ResNet v1.5 has stride = 2 in the 3x3 convolution (compared to v1 which has stride = 2 in the first 1x1 convolution). This makes ResNet50 v1.5 slightly more accurate (~0.5% top1) than v1, though with a small performance drawback (~5% imgs/sec) according to Nvidia.
Documentation
Model description
ResNet (Residual Network) is a convolutional neural network that popularized residual learning and skip connections, which allow much deeper models to be trained.
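To make the idea of a skip connection concrete, here is a minimal PyTorch sketch of a residual block. The class name `BasicResidualBlock` and the channel and image sizes are illustrative assumptions, not code taken from this model's implementation.

```python
import torch
from torch import nn


class BasicResidualBlock(nn.Module):
    """Illustrative residual block: output = relu(F(x) + x)."""

    def __init__(self, channels: int) -> None:
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual branch F(x): two 3x3 convolutions with batch norm.
        residual = self.relu(self.bn1(self.conv1(x)))
        residual = self.bn2(self.conv2(residual))
        # Skip connection: add the unchanged input back before the final activation.
        return self.relu(residual + x)


block = BasicResidualBlock(channels=64).eval()
x = torch.randn(1, 64, 56, 56)  # hypothetical feature map
with torch.no_grad():
    y = block(x)
print(y.shape)  # torch.Size([1, 64, 56, 56])
```

Because the input is added back unchanged, the block only has to learn a correction to the identity mapping, which is the property that makes very deep stacks of such blocks trainable.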
This is ResNet v1.5, different from the original model. In the bottleneck blocks that need downsampling, v1 has a stride of 2 in the first 1x1 convolution, while v1.5 has a stride of 2 in the 3x3 convolution. According to Nvidia, this difference makes ResNet50 v1.5 slightly more accurate (~0.5% top1) than v1, but comes with a small performance loss (~5% imgs/sec).
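The stride placement described above can be sketched in PyTorch as follows. This is an illustrative sketch under stated assumptions: the `Bottleneck` class, its `v1_5` flag, and the channel sizes are hypothetical and are not taken from the Transformers or Nvidia implementations.

```python
import torch
from torch import nn


class Bottleneck(nn.Module):
    """Sketch of a downsampling bottleneck block (1x1 -> 3x3 -> 1x1 plus shortcut)."""

    def __init__(self, in_ch: int, mid_ch: int, out_ch: int, v1_5: bool = True) -> None:
        super().__init__()
        # v1 downsamples in the first 1x1 convolution, v1.5 in the 3x3 convolution.
        stride_1x1, stride_3x3 = (1, 2) if v1_5 else (2, 1)
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, kernel_size=1, stride=stride_1x1, bias=False),
            nn.BatchNorm2d(mid_ch),
            nn.ReLU(),
            nn.Conv2d(mid_ch, mid_ch, kernel_size=3, stride=stride_3x3, padding=1, bias=False),
            nn.BatchNorm2d(mid_ch),
            nn.ReLU(),
            nn.Conv2d(mid_ch, out_ch, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        # Projection shortcut so the skip path matches the downsampled shape.
        self.shortcut = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=1, stride=2, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        self.relu = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.relu(self.body(x) + self.shortcut(x))


def n_params(module: nn.Module) -> int:
    return sum(p.numel() for p in module.parameters())


x = torch.randn(1, 256, 56, 56)  # hypothetical feature map
v1 = Bottleneck(256, 128, 512, v1_5=False).eval()
v15 = Bottleneck(256, 128, 512, v1_5=True).eval()
with torch.no_grad():
    y_v1, y_v15 = v1(x), v15(x)
print(y_v1.shape, y_v15.shape)       # both torch.Size([1, 512, 28, 28])
print(n_params(v1) == n_params(v15))  # True
```

Both variants have the same parameter count and output shape. In the v1.5 layout the first 1x1 convolution and the 3x3 convolution operate on the full-resolution feature map, so the block does more computation per image, which is consistent with the throughput cost reported by Nvidia.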

Intended uses & limitations
You can use the raw model for image classification. Check the model hub for versions fine-tuned on a task that interests you.
How to use
Basic Usage
from transformers import AutoFeatureExtractor, ResNetForImageClassification
import torch
from datasets import load_dataset

# Load a sample image from the Hugging Face Hub.
dataset = load_dataset("huggingface/cats-image")
image = dataset["test"]["image"][0]

# Load the preprocessing pipeline and the pre-trained classifier.
feature_extractor = AutoFeatureExtractor.from_pretrained("microsoft/resnet-34")
model = ResNetForImageClassification.from_pretrained("microsoft/resnet-34")

# Convert the image into a batch of PyTorch tensors.
inputs = feature_extractor(image, return_tensors="pt")

# Run inference without tracking gradients.
with torch.no_grad():
    logits = model(**inputs).logits

# Pick the highest-scoring class and map its index to a label.
predicted_label = logits.argmax(-1).item()
print(model.config.id2label[predicted_label])
For more code examples, refer to the documentation.
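The basic example prints only the single best class. As a hedged sketch, the helper below turns a logits tensor into the top-k labels with softmax probabilities. The function name `top_k_labels` is hypothetical, and the demo logits and labels are stand-ins; with the real model you would pass `logits` and `model.config.id2label` instead, assuming `id2label` is keyed by integer class index.

```python
import torch


def top_k_labels(
    logits: torch.Tensor, id2label: dict[int, str], k: int = 5
) -> list[tuple[str, float]]:
    """Return the k most probable (label, probability) pairs for one image."""
    # Softmax over the class dimension, then take the first (only) batch item.
    probs = logits.softmax(dim=-1)[0]
    scores, indices = probs.topk(min(k, probs.numel()))
    return [(id2label[i], s) for s, i in zip(scores.tolist(), indices.tolist())]


# Stand-in values for illustration only; not real model output.
demo_logits = torch.tensor([[0.1, 2.0, -1.0, 0.5]])
demo_labels = {0: "tabby", 1: "tiger cat", 2: "lynx", 3: "Egyptian cat"}

for label, prob in top_k_labels(demo_logits, demo_labels, k=3):
    print(f"{label}: {prob:.3f}")
```

Results are sorted from most to least probable, so the first entry matches what `argmax` would return.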
BibTeX entry and citation info
@inproceedings{he2016deep,
title={Deep residual learning for image recognition},
author={He, Kaiming and Zhang, Xiangyu and Ren, Shaoqing and Sun, Jian},
booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
pages={770--778},
year={2016}
}
License
This model is released under the Apache-2.0 license.
Information Table
| Property | Details |
|---|---|
| Model Type | ResNet-34 v1.5 |
| Training Data | ImageNet-1k |
| Tags | vision, image-classification |
Important Note
The team that released ResNet did not write a model card for this model, so this model card was written by the Hugging Face team.