🚀 ResNet-50 v1.5
A ResNet model pre-trained on ImageNet-1k at a resolution of 224x224, introduced in the paper Deep Residual Learning for Image Recognition by He et al.
Disclaimer: The team releasing ResNet did not write a model card for this model, so this model card has been written by the Hugging Face team.
✨ Features
- Vision and Image Classification: This model is designed for vision tasks, specifically image classification.
- Pre-trained on ImageNet-1k: It has been pre-trained on the ImageNet-1k dataset at a resolution of 224x224.
- Residual Learning and Skip Connections: ResNet popularized the concepts of residual learning and skip connections, enabling the training of much deeper models.
- ResNet v1.5: This version differs from the original model, offering slightly higher accuracy (~0.5% top-1) at the cost of roughly 5% lower throughput (imgs/sec), according to Nvidia.
| Property | Details |
| --- | --- |
| Model Type | Convolutional Neural Network (ResNet v1.5) |
| Training Data | ImageNet-1k |
📚 Documentation
Model description
ResNet (Residual Network) is a convolutional neural network that popularized the concepts of residual learning and skip connections, allowing for the training of much deeper models.
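To make the idea of a skip connection concrete, here is a minimal sketch in plain Python of the form y = F(x) + x. The names `residual_block` and `zero_transform` are hypothetical and chosen for illustration only; the real model applies this pattern to convolutional feature maps, not to short lists.

```python
from typing import Callable, List

Vector = List[float]


def residual_block(x: Vector, transform: Callable[[Vector], Vector]) -> Vector:
    """Return transform(x) + x, i.e. the skip-connection form y = F(x) + x."""
    fx = transform(x)
    if len(fx) != len(x):
        # The identity shortcut only works when F(x) and x have the same shape.
        raise ValueError("F(x) must have the same length as x")
    return [f + xi for f, xi in zip(fx, x)]


def zero_transform(x: Vector) -> Vector:
    """Hypothetical stand-in for a layer stack whose output is all zeros."""
    return [0.0 for _ in x]


x = [1.0, -2.0, 3.0]
identity_out = residual_block(x, zero_transform)
print(identity_out)  # [1.0, -2.0, 3.0]
```

When the learned part F outputs zeros, the block passes its input through unchanged. This is one common intuition for why stacking many such blocks does not have to make optimization harder.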
This is ResNet v1.5, which differs from the original model in the bottleneck blocks that downsample: v1 applies the stride of 2 in the first 1x1 convolution, while v1.5 applies it in the 3x3 convolution. According to Nvidia, this makes ResNet-50 v1.5 slightly more accurate than v1 (~0.5% top-1) at the cost of roughly 5% lower throughput (imgs/sec).
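The stride placement can be illustrated by tracking the spatial size after each convolution in a downsampling bottleneck block. The sketch below is plain Python rather than model code. It assumes a 56x56 input feature map and a padding of 1 on the 3x3 convolution; both are illustrative assumptions, and `bottleneck_sizes` is a hypothetical helper.

```python
from typing import List


def conv_output_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Standard output-size formula for a convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def bottleneck_sizes(input_size: int, variant: str) -> List[int]:
    """Spatial size after the 1x1, 3x3 and 1x1 convolutions of a downsampling block."""
    if variant == "v1":
        strides = (2, 1, 1)  # v1: stride 2 in the first 1x1 convolution
    elif variant == "v1.5":
        strides = (1, 2, 1)  # v1.5: stride 2 in the 3x3 convolution
    else:
        raise ValueError(f"unknown variant: {variant!r}")

    # (kernel, padding) for each convolution; padding=1 on the 3x3 is assumed.
    layers = ((1, 0), (3, 1), (1, 0))
    sizes = []
    size = input_size
    for (kernel, padding), stride in zip(layers, strides):
        size = conv_output_size(size, kernel, stride, padding)
        sizes.append(size)
    return sizes


v1_sizes = bottleneck_sizes(56, "v1")
v15_sizes = bottleneck_sizes(56, "v1.5")
print(v1_sizes)   # [28, 28, 28]
print(v15_sizes)  # [56, 28, 28]
```

Both variants end at the same resolution. In v1.5 the first 1x1 convolution still runs at full resolution, so the block does more work, which is consistent with the lower throughput noted above. A 1x1 convolution with stride 2 also reads only one input position in four, whereas the strided 3x3 convolution covers every position.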

Intended uses & limitations
You can use the raw model for image classification. Check the model hub to find fine-tuned versions for tasks that interest you.
💻 Usage Examples
Basic Usage
from transformers import AutoImageProcessor, ResNetForImageClassification
import torch
from datasets import load_dataset

# Load a sample image from the Hugging Face Hub
dataset = load_dataset("huggingface/cats-image")
image = dataset["test"]["image"][0]

# Load the image processor and the pre-trained model
processor = AutoImageProcessor.from_pretrained("microsoft/resnet-50")
model = ResNetForImageClassification.from_pretrained("microsoft/resnet-50")

# Preprocess the image into PyTorch tensors
inputs = processor(image, return_tensors="pt")

# Run inference without tracking gradients
with torch.no_grad():
    logits = model(**inputs).logits

# The model predicts one of the 1,000 ImageNet-1k classes
predicted_label = logits.argmax(-1).item()
print(model.config.id2label[predicted_label])
For more code examples, refer to the documentation.
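The basic example prints only the single most likely class. As a hedged sketch, the logits can also be converted to probabilities with a softmax and ranked to list the top-k classes. The code below is plain Python; the logits and the label mapping are made-up placeholders, not real model outputs, and `top_k` is a hypothetical helper.

```python
import math
from typing import Dict, List, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits: List[float], id2label: Dict[int, str], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Placeholder values for illustration only.
example_logits = [2.0, 0.5, -1.0, 3.0]
example_labels = {0: "tabby cat", 1: "golden retriever", 2: "teapot", 3: "Egyptian cat"}

top = top_k(example_logits, example_labels, k=3)
for label, prob in top:
    print(f"{label}: {prob:.3f}")
```

With the real model, one would pass `logits[0].tolist()` and `model.config.id2label` from the basic example in place of the placeholders.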
BibTeX entry and citation info
@inproceedings{he2016deep,
title={Deep residual learning for image recognition},
author={He, Kaiming and Zhang, Xiangyu and Ren, Shaoqing and Sun, Jian},
booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
pages={770--778},
year={2016}
}
📄 License
This model is licensed under the Apache-2.0 license.