
VideoMAE Small Fine-tuned on SSV2

Developed by MCG-NJU
VideoMAE is a video model pretrained in a self-supervised manner with the Masked Autoencoder (MAE) approach and fine-tuned on the Something-Something V2 dataset for video classification.
Downloads: 140
Release date: 2023-04-16

Model Overview

This model was pretrained in a self-supervised manner for 2400 epochs and then fine-tuned with supervision on the Something-Something V2 dataset; it classifies a video into one of 174 labels.

Model Features

Self-supervised Pretraining
Utilizes Masked Autoencoder (MAE) method for self-supervised pretraining, effectively learning internal video representations
Efficient Video Processing
Processes videos into fixed-size patch sequences that are handled efficiently by a Transformer architecture
SSV2 Dataset Fine-tuning
Fine-tuned on the Something-Something V2 dataset, specifically designed for action recognition tasks
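The "fixed-size patch sequence" above can be made concrete with a little arithmetic. A minimal sketch, assuming the standard VideoMAE configuration (16 sampled frames at 224x224 resolution, 16x16 spatial patches, temporal tubelet size 2):

```python
def num_video_tokens(frames=16, height=224, width=224, patch=16, tubelet=2):
    """Number of patch tokens the Transformer sees for one clip.

    Each token covers a `tubelet x patch x patch` cube of the video,
    so the sequence length is the product of the three grid sizes.
    Defaults are the standard VideoMAE configuration (an assumption,
    not taken from this page).
    """
    return (frames // tubelet) * (height // patch) * (width // patch)

# 8 temporal positions x 14 x 14 spatial positions
print(num_video_tokens())  # → 1568
```

Masking a large fraction of these 1568 tokens during MAE pretraining is what makes the self-supervised stage efficient.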

Model Capabilities

Video Classification
Action Recognition
Feature Extraction
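For video classification, the model can be loaded through the HuggingFace `transformers` library. A minimal sketch (the checkpoint name matches this model's hub ID; the random frames are a stand-in for a real decoded clip):

```python
import numpy as np
import torch
from transformers import VideoMAEImageProcessor, VideoMAEForVideoClassification

ckpt = "MCG-NJU/videomae-small-finetuned-ssv2"
processor = VideoMAEImageProcessor.from_pretrained(ckpt)
model = VideoMAEForVideoClassification.from_pretrained(ckpt)

# 16 RGB frames, channels-first; replace with frames decoded from a real video.
video = list(np.random.randint(0, 256, (16, 3, 224, 224), dtype=np.uint8))
inputs = processor(video, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # one score per SSV2 label, shape (1, 174)

print(model.config.id2label[logits.argmax(-1).item()])
```

The predicted label is one of the 174 Something-Something V2 classes mentioned in the overview.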

Use Cases

Video Understanding
Action Recognition
Identify human actions and behaviors in videos
Achieves 66.8% top-1 accuracy on the SSV2 test set
Video Content Analysis
Analyze and automatically classify video content
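The 66.8% figure above is top-1 accuracy: the fraction of test clips whose highest-scoring prediction matches the true label. A minimal sketch of the metric on toy logits (not the actual evaluation code):

```python
import numpy as np

def top_k_accuracy(logits, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    topk = np.argsort(logits, axis=1)[:, -k:]
    return float(np.mean([labels[i] in topk[i] for i in range(len(labels))]))

# Three toy samples over three classes; predictions for samples 0 and 2 are correct.
logits = np.array([[0.1, 0.7, 0.2],
                   [0.5, 0.3, 0.2],
                   [0.2, 0.2, 0.6]])
labels = np.array([1, 2, 2])
print(top_k_accuracy(logits, labels, k=1))  # → 2/3 ≈ 0.667
```

Top-5 accuracy (k=5) is the other metric commonly reported alongside top-1 on SSV2.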