
VideoMAE Base SSV2

Developed by MCG-NJU
VideoMAE is a self-supervised video pre-training model based on the masked autoencoder (MAE) approach, pre-trained for 2,400 epochs on the Something-Something V2 dataset.
Release date: 8/2/2022

Model Overview

This model learns internal video representations through self-supervision and is suitable for fine-tuning downstream tasks such as video classification.

Model Features

Self-Supervised Pre-training
Uses the masked autoencoder approach, so pre-training requires no labeled data
Efficient Video Learning
Learns video representations by predicting the content of masked video patches
ViT-Based Architecture
Built on the Vision Transformer (ViT) architecture, well suited to processing video sequences
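The masking scheme above can be sketched with the token arithmetic of the standard VideoMAE-Base configuration. The geometry below (16 frames at 224×224, 16×16 spatial patches, 2-frame tubelets, ~90% tube masking) is the commonly published base setup and is an assumption here; confirm the exact values against this checkpoint's config.

```python
import random

# Standard VideoMAE-Base input geometry (assumed; confirm against the checkpoint config):
frames, height, width = 16, 224, 224
patch, tubelet = 16, 2     # 16x16 spatial patches grouped into 2-frame tubelets
mask_ratio = 0.9           # VideoMAE masks roughly 90% of tokens

spatial_tokens = (height // patch) * (width // patch)   # 14 * 14 = 196 per tubelet
temporal_slots = frames // tubelet                      # 16 / 2 = 8 tubelet slots
num_tokens = temporal_slots * spatial_tokens            # 8 * 196 = 1568 tokens total

# Tube masking: the same spatial positions are masked in every temporal slot,
# so a masked patch cannot be recovered by copying it from an adjacent frame.
masked_spatial = random.sample(range(spatial_tokens), int(spatial_tokens * mask_ratio))
masked = [(t, s) for t in range(temporal_slots) for s in masked_spatial]
visible = num_tokens - len(masked)                      # the encoder sees only these

print(num_tokens, len(masked), visible)   # 1568 1408 160
```

The very high masking ratio is what makes pre-training cheap: the encoder processes only the ~160 visible tokens instead of all 1,568.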

Model Capabilities

Video Feature Extraction
Masked Video Patch Prediction
Fine-tuning for Video Classification
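The masked-prediction capability corresponds to the MAE training objective: reconstruct the hidden patches and score the reconstruction only on the masked positions. The toy function below is a deliberately crude pure-Python sketch of that loss shape, not the model's actual encoder/decoder; the `predict` baseline (mean of the visible patches) is hypothetical.

```python
import random

def mae_loss(patches, mask_ratio=0.9, predict=lambda visible_mean: visible_mean):
    """Toy masked-autoencoder objective: reconstruct masked patch values and
    compute the error ONLY over masked positions, as in VideoMAE.
    `predict` stands in for the encoder/decoder; the default simply predicts
    the mean of the visible patches (a crude illustrative baseline)."""
    n = len(patches)
    masked_idx = set(random.sample(range(n), int(n * mask_ratio)))
    visible = [patches[i] for i in range(n) if i not in masked_idx]
    prediction = predict(sum(visible) / len(visible))
    # Mean squared error over masked patches only; visible patches are excluded.
    return sum((patches[i] - prediction) ** 2 for i in masked_idx) / len(masked_idx)

random.seed(0)
toy_video = [float(i % 7) for i in range(100)]  # 100 fake patch intensities
print(mae_loss(toy_video))
```

Because the loss ignores visible patches, the model cannot lower it by trivially copying its input; it has to infer the masked content from spatiotemporal context.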

Use Cases

Video Understanding
Video Classification
Fine-tune the pre-trained model for video classification tasks
Video Representation Learning
Extract video features for downstream tasks