# Vision Transformer (ViT) for Facial Expression Recognition Model Card
This model card describes a Vision Transformer (ViT) fine-tuned for facial expression recognition, covering its architecture, training data, and performance.
## 🚀 Quick Start
This model is readily accessible on the Hugging Face platform. You can use it for facial expression recognition tasks by following the steps in the official Hugging Face documentation, or with the short sketch below.
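For example, a minimal inference sketch using the Transformers `pipeline` API (the image path is a placeholder):

```python
from transformers import pipeline

# Load the fine-tuned ViT as an image-classification pipeline
classifier = pipeline("image-classification", model="trpakov/vit-face-expression")

# "face.jpg" is a placeholder path to a (preferably cropped) face image
predictions = classifier("face.jpg")
print(predictions)  # list of {"label": ..., "score": ...} dicts, highest score first
```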
## ✨ Features
- Task-Specific Fine-Tuning: The `vit-face-expression` model is specifically fine-tuned for facial emotion recognition, enabling accurate identification of seven different emotions.
- Data Augmentation: During training, data augmentation techniques such as rotations, flips, and zooms are applied to enhance the model's generalization ability.
## 📦 Installation
There is no dedicated installation step for this model. To use it, install the Hugging Face Transformers library:

```bash
pip install transformers
```
## 📚 Documentation
### Model Overview
- Model Name: [trpakov/vit-face-expression](https://huggingface.co/trpakov/vit-face-expression)
- Task: Facial Expression/Emotion Recognition
- Dataset: FER2013
- Model Architecture: Vision Transformer (ViT)
- Fine-tuned from model: [vit-base-patch16-224-in21k](https://huggingface.co/google/vit-base-patch16-224-in21k)
### Model Description
The `vit-face-expression` model is a Vision Transformer fine-tuned for the task of facial emotion recognition. It is trained on the FER2013 dataset, which consists of facial images categorized into seven different emotions (an inference sketch follows the list):
- Angry
- Disgust
- Fear
- Happy
- Sad
- Surprise
- Neutral
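These seven labels correspond to the model's output classes. A minimal sketch of explicit inference with the Transformers auto classes (assuming PyTorch; `face.jpg` is a placeholder path):

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

processor = AutoImageProcessor.from_pretrained("trpakov/vit-face-expression")
model = AutoModelForImageClassification.from_pretrained("trpakov/vit-face-expression")

# FER2013-style images are grayscale, so convert to RGB for the ViT
image = Image.open("face.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Map the highest-scoring class index back to its emotion label
predicted_id = logits.argmax(-1).item()
print(model.config.id2label[predicted_id])
```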
### Data Preprocessing
The input images are preprocessed before being fed into the model (see the sketch after this list). The preprocessing steps include:
- Resizing: Images are resized to the model's expected input size (224×224 for this ViT).
- Normalization: Pixel values are normalized to the fixed range expected by the model.
- Data Augmentation: Random transformations such as rotations, flips, and zooms are applied to augment the training dataset.
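A sketch of what such a training-time pipeline might look like with torchvision is below; the augmentation parameters and normalization statistics are illustrative assumptions, since the exact values used to train this model are not documented here:

```python
from torchvision import transforms

# Illustrative preprocessing/augmentation pipeline; the parameter
# values here are assumptions, not the model's documented settings.
train_transform = transforms.Compose([
    transforms.Resize((224, 224)),                        # resize to the ViT input size
    transforms.RandomRotation(degrees=10),                # small random rotations
    transforms.RandomHorizontalFlip(p=0.5),               # random horizontal flips
    transforms.RandomResizedCrop(224, scale=(0.9, 1.0)),  # zoom-like random crops
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5],            # normalize pixel values
                         std=[0.5, 0.5, 0.5]),
])
```

At inference time only the deterministic steps (resizing and normalization) apply, which the model's image processor handles automatically.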
### Evaluation Metrics
- Validation set accuracy: 0.7113
- Test set accuracy: 0.7116
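A rough sketch of how such an accuracy figure could be reproduced, assuming a local test split stored as `(image_path, label)` pairs (the data layout and label strings are assumptions; labels must match the model's `id2label` values):

```python
from transformers import pipeline

classifier = pipeline("image-classification", model="trpakov/vit-face-expression")

def top1_accuracy(examples):
    """Top-1 accuracy over a list of (image_path, true_label) pairs."""
    correct = 0
    for path, label in examples:
        top = classifier(path)[0]["label"]  # highest-scoring prediction
        correct += int(top == label)
    return correct / len(examples)
```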
### Limitations
- Data Bias: The model's performance may be influenced by biases present in the FER2013 training data.
- Generalization: The model's ability to generalize to unseen faces depends on the diversity of the training dataset.
## 📄 License
This model is released under the Apache 2.0 license.