VideoMAE Small Finetuned Kinetics

Developed by MCG-NJU
VideoMAE is a masked autoencoder for video. This model was pretrained with self-supervision and then fine-tuned on the Kinetics-400 dataset for video classification.
Downloads 2,152
Release Time: 4/16/2023

Model Overview

This model uses a masked autoencoder architecture and is fine-tuned for video classification. It recognizes the 400 action categories of the Kinetics-400 dataset.

Model Features

Self-supervised Pretraining
Learns internal video representations through 1600 epochs of self-supervised pretraining.
Efficient Video Classification
After fine-tuning on the Kinetics-400 dataset, it can accurately recognize 400 action categories.
Masked Autoencoder Architecture
Uses a masked autoencoder approach for video pretraining, improving data efficiency.
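
To illustrate the masked autoencoder idea, here is a minimal sketch of how patches might be hidden from the encoder during pretraining. This page does not describe the masking scheme, so the tube-style mask (the same spatial positions hidden in every frame), the 0.9 mask ratio, and the 14 x 14 patch grid are all assumptions chosen for illustration. The function name tube_mask is hypothetical.

```python
import random


def tube_mask(num_frames, grid_h, grid_w, mask_ratio=0.9, seed=0):
    """Return one boolean mask per frame (True = patch is hidden).

    Assumption: the same spatial positions are hidden in every frame,
    so each hidden position forms a "tube" through time.
    """
    rng = random.Random(seed)
    num_positions = grid_h * grid_w
    num_masked = int(num_positions * mask_ratio)
    masked = set(rng.sample(range(num_positions), num_masked))
    frame_mask = [pos in masked for pos in range(num_positions)]
    # Repeat the same spatial mask for every time step.
    return [list(frame_mask) for _ in range(num_frames)]


mask = tube_mask(num_frames=8, grid_h=14, grid_w=14)
visible = sum(not hidden for hidden in mask[0])
print(f"visible patches per frame: {visible} of {14 * 14}")
```

With most patches hidden, the encoder only processes the small visible subset, and the pretraining objective is to reconstruct the rest.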

Model Capabilities

Video Classification
Action Recognition
Video Feature Extraction
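
The capabilities above could be exercised roughly as follows with the Hugging Face transformers library. This is a sketch under stated assumptions, not an official recipe: the repository id "MCG-NJU/videomae-small-finetuned-kinetics" is inferred from the developer and model name on this page, the 16-frame clip length is assumed, and sample_frame_indices and classify_clip are hypothetical helper names.

```python
def sample_frame_indices(total_frames, clip_len=16):
    """Pick clip_len evenly spaced frame indices from a video.

    Assumption: the model expects a fixed-length clip of 16 frames.
    """
    if clip_len < 2 or total_frames < clip_len:
        raise ValueError("need clip_len >= 2 and at least clip_len frames")
    return [i * (total_frames - 1) // (clip_len - 1) for i in range(clip_len)]


def classify_clip(frames, repo_id="MCG-NJU/videomae-small-finetuned-kinetics"):
    """Return the predicted Kinetics-400 label for a list of RGB frames.

    frames: list of NumPy arrays, one per frame.
    repo_id: assumed Hugging Face Hub id, not stated on this page.
    """
    # Imported lazily so the sampling helper works without these packages.
    import torch
    from transformers import VideoMAEForVideoClassification, VideoMAEImageProcessor

    processor = VideoMAEImageProcessor.from_pretrained(repo_id)
    model = VideoMAEForVideoClassification.from_pretrained(repo_id)

    inputs = processor(list(frames), return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits

    # Map the highest-scoring class index back to its label string.
    return model.config.id2label[logits.argmax(-1).item()]
```

A typical call would decode a video into frames, select frames with sample_frame_indices(len(all_frames)), and pass the selected frames to classify_clip.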

Use Cases

Video Content Analysis
Action Recognition
Recognize human actions in videos
Achieves 79.0% top-1 accuracy on the Kinetics-400 test set
Video Classification
Classify videos into 400 predefined categories
Achieves 93.8% top-5 accuracy on the Kinetics-400 test set
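
For readers unfamiliar with the two metrics quoted above, the following sketch shows how top-k accuracy is usually computed: a prediction counts as correct when the true label is among the k highest-scoring classes. The scores and labels are toy values invented for illustration, not Kinetics-400 results, and top_k_accuracy is a hypothetical helper name.

```python
def top_k_accuracy(scores, labels, k):
    """Percentage of samples whose true label is among the k top-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        hits += label in ranked[:k]
    return 100.0 * hits / len(labels)


# Toy data: 4 samples, 3 classes (illustrative only).
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [0.4, 0.35, 0.25],
]
labels = [1, 1, 2, 2]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(top1, top2)
```

Top-5 accuracy is always at least as high as top-1, since any top-1 hit is also a top-5 hit.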