
VideoMAE Large Finetuned Kinetics

Developed by MCG-NJU
VideoMAE is a self-supervised video pre-training model based on the masked autoencoder (MAE) approach, fine-tuned on the Kinetics-400 dataset for video classification.
Downloads: 4,657
Release Date: 8/2/2022

Model Overview

This model is pre-trained in a self-supervised manner and fine-tuned under supervision on Kinetics-400, capable of classifying videos into 400 possible categories.
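
The snippet below is a minimal inference sketch using the standard Hugging Face `transformers` VideoMAE classes. It assumes the checkpoint is published on the Hub as `MCG-NJU/videomae-large-finetuned-kinetics` and that 16 decoded RGB frames of the input clip are available; the random frames here are only placeholders.

```python
import numpy as np
import torch
from transformers import VideoMAEImageProcessor, VideoMAEForVideoClassification

model_name = "MCG-NJU/videomae-large-finetuned-kinetics"  # assumed Hub id
processor = VideoMAEImageProcessor.from_pretrained(model_name)
model = VideoMAEForVideoClassification.from_pretrained(model_name)

# 16 dummy RGB frames (height x width x channels); replace with real decoded frames.
video = [np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8) for _ in range(16)]

inputs = processor(video, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

predicted_class = logits.argmax(-1).item()
print(model.config.id2label[predicted_class])  # one of the 400 Kinetics-400 labels
```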

Model Features

Self-Supervised Pre-Training
Uses the masked autoencoder (MAE) method for self-supervised video pre-training, offering high data efficiency
Strong Video Understanding Capability
Demonstrates excellent video classification performance after fine-tuning on the Kinetics-400 dataset
Transformer Architecture
Based on the Vision Transformer (ViT) architecture, which processes video sequences effectively

Model Capabilities

Video Classification
Video Feature Extraction
Video Content Understanding

Use Cases

Video Content Analysis
Video Classification
Classifies videos into one of the 400 Kinetics-400 categories
Achieves 84.7% top-1 accuracy on the Kinetics-400 test set
Video Content Understanding
Extracts high-level feature representations of videos
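
For the feature-extraction use case, a hedged sketch (reusing the `model` and `inputs` objects from the classification example above) is to read the encoder's hidden states instead of the logits and pool them into a clip-level embedding; the mean-pooling step is an illustrative choice, not something specified by the model card.

```python
# Reuse `model` and `inputs` from the classification example above.
with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

last_hidden = outputs.hidden_states[-1]   # (batch, num_tokens, hidden_size)
clip_embedding = last_hidden.mean(dim=1)  # assumed pooling: mean over patch tokens
print(clip_embedding.shape)               # e.g. torch.Size([1, 1024]) for the large model
```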