Chest X-ray Image Classifier
This repository holds a fine-tuned Vision Transformer (ViT) model for classifying chest X-ray images, trained on the CheXpert dataset. The model is fine-tuned to classify various lung diseases from chest radiographs and reached a final validation accuracy of 98.46%.
Features
- Based on the Vision Transformer (ViT) architecture, which uses attention mechanisms for efficient feature extraction in image-based tasks.
- Trained on the CheXpert dataset with labeled chest X-ray images for detecting diseases like pneumonia and cardiomegaly.
- Achieved a high final validation accuracy of 98.46% during training.
Usage Examples
Basic Usage
To use the fine-tuned model for inference, load it from the Hugging Face Model Hub and pass in a chest X-ray image:
from PIL import Image
import torch
from transformers import AutoImageProcessor, AutoModelForImageClassification

# Load the image processor and the fine-tuned model from the Hugging Face Hub
processor = AutoImageProcessor.from_pretrained("codewithdark/vit-chest-xray")
model = AutoModelForImageClassification.from_pretrained("codewithdark/vit-chest-xray")

# Label names, indexed by the model's output class index
label_columns = ['Cardiomegaly', 'Edema', 'Consolidation', 'Pneumonia', 'No Finding']

# Path to the chest X-ray image to classify
image_path = "/content/images.jpeg"
image = Image.open(image_path)

# X-rays are often stored as grayscale; convert to three-channel RGB if needed
if image.mode != 'RGB':
    image = image.convert('RGB')
    print("Image converted to RGB.")

# Preprocess the image into model-ready PyTorch tensors
inputs = processor(images=image, return_tensors="pt")

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits

# Pick the highest-scoring class and map it to its label
predicted_class_idx = torch.argmax(logits, dim=-1).item()
predicted_class_label = label_columns[predicted_class_idx]

print(f"Predicted Class Index: {predicted_class_idx}")
print(f"Predicted Class Label: {predicted_class_label}")
'''
Output:
Predicted Class Index: 4
Predicted Class Label: No Finding
'''
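The snippet above reports only the top class. Since the Training Details section lists Binary Cross-Entropy with Logits as the loss, one reasonable reading is that each logit can be passed through a sigmoid to get an independent per-label score. The following is a minimal sketch under that assumption, written in plain Python so the arithmetic is visible; `rank_labels` and the example logits are hypothetical and not part of this repository.

```python
import math

# Label order copied from the usage example above.
LABELS = ["Cardiomegaly", "Edema", "Consolidation", "Pneumonia", "No Finding"]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def rank_labels(logits, labels=LABELS):
    """Return (label, score) pairs sorted from highest to lowest score."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    scored = [(label, sigmoid(x)) for label, x in zip(labels, logits)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical logits for one image (assumption, not real model output).
# With the snippet above, real values would come from outputs.logits[0].tolist().
example_logits = [-2.1, -1.3, -0.4, -3.0, 2.2]
ranking = rank_labels(example_logits)
for label, score in ranking:
    print(f"{label:>13}: {score:.3f}")
```

Because the sigmoid is monotonic, the top-ranked label always matches the `torch.argmax` result; the extra value is seeing how close the runner-up labels are.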
Advanced Usage
To fine-tune the model on your own dataset, follow the instructions in this repo and adapt the code to your dataset and training configuration.
Documentation
Model Overview
The fine-tuned model is based on the Vision Transformer (ViT) architecture, which excels in handling image-based tasks by leveraging attention mechanisms for efficient feature extraction. The model was trained on the CheXpert dataset, which consists of labeled chest X-ray images for detecting diseases such as pneumonia, cardiomegaly, and others.
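To make the attention-based design a little more concrete: a ViT splits each image into fixed-size patches and attends over the resulting token sequence. The sketch below only computes the length of that sequence. The 224-pixel input and 16-pixel patch are assumptions taken from the common ViT-Base/16 configuration; this README does not state which base checkpoint was used, and `vit_sequence_length` is a hypothetical helper.

```python
def vit_sequence_length(image_size: int, patch_size: int, class_token: bool = True) -> int:
    """Number of tokens a ViT attends over for a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    patches_per_side = image_size // patch_size
    # One token per patch, plus an optional classification token.
    return patches_per_side ** 2 + (1 if class_token else 0)


# Assumed ViT-Base/16 style settings (not confirmed by this README).
print(vit_sequence_length(224, 16))  # 197
```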
Performance
- Final Validation Accuracy: 98.46%
- Final Training Loss: 0.1069
- Final Validation Loss: 0.0980
The final validation loss (0.0980) is slightly below the final training loss (0.1069), which suggests the model generalizes to held-out chest X-ray images rather than overfitting the training split.
Dataset
The dataset used for fine-tuning the model is the CheXpert dataset, which includes chest X-ray images from various patients with multi-label annotations. The data includes frontal and lateral views of the chest for each patient, annotated with labels for various lung diseases.
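Multi-label annotation means one image can carry several findings at once. A common way to represent this for training is a multi-hot vector with one slot per label. The sketch below assumes the five-label order from the usage example; `encode_findings` is a hypothetical helper, and how this repository actually encoded its targets (including any findings that CheXpert marks as uncertain) is not stated here.

```python
# Label order copied from the usage example; assumed to match the training targets.
LABELS = ["Cardiomegaly", "Edema", "Consolidation", "Pneumonia", "No Finding"]


def encode_findings(findings, labels=LABELS):
    """Turn a collection of finding names into a multi-hot target vector."""
    present = set(findings)
    unknown = present - set(labels)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1.0 if label in present else 0.0 for label in labels]


# Two findings on the same image set two slots to 1.0.
print(encode_findings(["Cardiomegaly", "Edema"]))  # [1.0, 1.0, 0.0, 0.0, 0.0]
print(encode_findings(["No Finding"]))             # [0.0, 0.0, 0.0, 0.0, 1.0]
```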
For more details on the dataset, visit the CheXpert official website.
Training Details
The model was fine-tuned using the following settings:
- Optimizer: AdamW
- Learning Rate: 3e-5
- Batch Size: 32
- Epochs: 10
- Loss Function: Binary Cross-Entropy with Logits
- Precision: Mixed precision (via torch.amp)
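For readers unfamiliar with the loss listed above, here is a minimal plain-Python sketch of binary cross-entropy computed directly from logits for a single image. It is an illustration of the formula, not the training code from this repository; in PyTorch the same loss is provided by `torch.nn.BCEWithLogitsLoss`. The example logits and targets are made up.

```python
import math


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over labels, computed from raw logits.

    Uses the stable form max(x, 0) - x*y + log(1 + exp(-|x|)), which equals
    -[y*log(sigmoid(x)) + (1 - y)*log(1 - sigmoid(x))] but avoids overflow
    for large positive or negative logits.
    """
    if len(logits) != len(targets):
        raise ValueError("logits and targets must have the same length")
    total = 0.0
    for x, y in zip(logits, targets):
        total += max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
    return total / len(logits)


# Hypothetical logits and multi-hot targets for one image with five labels.
logits = [2.0, -1.0, 0.0, -3.0, -2.0]
targets = [1.0, 0.0, 0.0, 0.0, 0.0]
loss = bce_with_logits(logits, targets)
print(f"BCE with logits: {loss:.4f}")
```

Each label contributes its own independent term, which is why this loss suits multi-label targets where more than one finding can be present.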
Technical Details
The model uses the Vision Transformer (ViT) architecture, which is well-suited for image-based tasks. By leveraging attention mechanisms, it can efficiently extract features from chest X-ray images. The model was trained on the CheXpert dataset, which provides a large number of labeled chest X-ray images for training and validation.
License
This model is available under the MIT License. See LICENSE for more details.
Acknowledgements
- CheXpert Dataset
- Hugging Face for providing the transformers library and Model Hub.
Happy coding!