VideoMAE-v2 (base-sized model, pretrained on UnlabeledHybrid-1M)
The VideoMAEv2-Base model is pre-trained in a self-supervised manner for 800 epochs on the UnlabeledHybrid-1M dataset. The pre-trained backbone can be used directly for video feature extraction or fine-tuned for video classification.
Quick Start
VideoMAE V2 was introduced in the CVPR 2023 paper [VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking](https://arxiv.org/abs/2303.16727) by Wang et al. and first released in [this repository](https://github.com/OpenGVLab/VideoMAEv2).
Features
- The model can be used for video feature extraction.
Installation
No model-specific installation steps are required; the usage example below only depends on the `transformers`, `torch`, and `numpy` packages. A typical setup with pip (the original card pins no versions, so treat this as a reasonable default):
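```bash
pip install transformers torch numpy
```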
Usage Examples
Basic Usage
Here is how to use this model to extract video features:
```python
from transformers import VideoMAEImageProcessor, AutoModel, AutoConfig
import numpy as np
import torch

# Load the remote-code model, its config, and the matching image processor
config = AutoConfig.from_pretrained("OpenGVLab/VideoMAEv2-Base", trust_remote_code=True)
processor = VideoMAEImageProcessor.from_pretrained("OpenGVLab/VideoMAEv2-Base")
model = AutoModel.from_pretrained("OpenGVLab/VideoMAEv2-Base", config=config, trust_remote_code=True)

# 16 random frames of shape (channels, height, width) stand in for a real clip
video = list(np.random.rand(16, 3, 224, 224))
inputs = processor(video, return_tensors="pt")

# The processor returns (batch, frames, channels, height, width);
# the model expects (batch, channels, frames, height, width)
inputs["pixel_values"] = inputs["pixel_values"].permute(0, 2, 1, 3, 4)

with torch.no_grad():
    outputs = model(**inputs)
```
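The random array above is only a stand-in. To run the same pipeline on a real clip, you can uniformly sample 16 RGB frames from a video file first. A minimal sketch using OpenCV (an assumption on our part; any frame reader such as decord works equally well, and `example.mp4` is a hypothetical path):

```python
import cv2  # assumption: installed via `pip install opencv-python`
import numpy as np

def sample_frames(path, num_frames=16):
    """Uniformly sample `num_frames` RGB frames from a video file."""
    cap = cv2.VideoCapture(path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    indices = np.linspace(0, total - 1, num_frames).astype(int)
    frames = []
    for idx in indices:
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
        ok, frame = cap.read()
        if not ok:
            break
        # OpenCV decodes to BGR; the processor expects RGB
        frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    cap.release()
    return frames

# The processor handles resizing and normalization; the permute and
# forward pass from the example above then apply unchanged.
video = sample_frames("example.mp4")
inputs = processor(video, return_tensors="pt")
```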
Documentation
Intended uses & limitations
You can use the raw model for video feature extraction.
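What `outputs` contains depends on the remote-code implementation, so the attribute names below are assumptions rather than a documented API. A hedged sketch for reducing the output to one feature vector per clip:

```python
# Assumption: the remote-code forward may return a raw tensor, a tuple,
# or an output object exposing `last_hidden_state` -- handle all three.
features = outputs[0] if isinstance(outputs, (tuple, list)) else outputs
features = getattr(features, "last_hidden_state", features)

# If per-token embeddings (batch, tokens, dim) remain, mean-pool over
# the token dimension to get a single vector per clip.
if features.dim() == 3:
    features = features.mean(dim=1)

print(features.shape)  # expected (1, 768) for the base model
```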
BibTeX entry and citation info
```bibtex
@InProceedings{wang2023videomaev2,
    author    = {Wang, Limin and Huang, Bingkun and Zhao, Zhiyu and Tong, Zhan and He, Yinan and Wang, Yi and Wang, Yali and Qiao, Yu},
    title     = {VideoMAE V2: Scaling Video Masked Autoencoders With Dual Masking},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
    pages     = {14549-14560}
}

@misc{videomaev2,
    title         = {VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking},
    author        = {Limin Wang and Bingkun Huang and Zhiyu Zhao and Zhan Tong and Yinan He and Yi Wang and Yali Wang and Yu Qiao},
    year          = {2023},
    eprint        = {2303.16727},
    archivePrefix = {arXiv},
    primaryClass  = {cs.CV}
}
```
License
This project is licensed under the CC BY-NC 4.0 license.