Timesformer Base Finetuned K400
TimeSformer is a video classification model built on spatio-temporal attention, fine-tuned on the Kinetics-400 dataset.
Downloads 17
Release Time: 12/10/2022
Model Overview
The model classifies a video into one of the 400 Kinetics-400 labels. It processes video data with attention alone, without convolutional operations.
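As a rough illustration of how such a classifier might be called, the sketch below uses the Hugging Face `transformers` library. The checkpoint name `facebook/timesformer-base-finetuned-k400` is an assumption inferred from the model title and is not stated on this page. The helper names `top_k_labels` and `classify_clip` are hypothetical, and `torch` and `transformers` are assumed to be installed.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed checkpoint name on the Hugging Face Hub; this page does not state it.
MODEL_ID = "facebook/timesformer-base-finetuned-k400"


def top_k_labels(
    probs: Sequence[float], id2label: Dict[int, str], k: int = 5
) -> List[Tuple[str, float]]:
    """Return the k highest-probability (label, probability) pairs, best first."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


def classify_clip(frames, k: int = 5) -> List[Tuple[str, float]]:
    """Classify one clip given as a list of RGB frames (H x W x 3 arrays)."""
    # Heavy dependencies are imported lazily so top_k_labels works without them.
    import torch
    from transformers import AutoImageProcessor, TimesformerForVideoClassification

    processor = AutoImageProcessor.from_pretrained(MODEL_ID)
    model = TimesformerForVideoClassification.from_pretrained(MODEL_ID)
    model.eval()

    # The processor resizes and normalizes the frames and adds a batch dimension.
    inputs = processor(list(frames), return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # one score per Kinetics-400 label
    probs = logits.softmax(dim=-1)[0].tolist()
    return top_k_labels(probs, model.config.id2label, k)
```

Calling `classify_clip(frames)` with a short list of frames (8 is assumed here as the clip length for this checkpoint) would download the weights on first use and return the five most likely labels with their probabilities.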
Model Features
Pure Attention Mechanism
Processes video using only spatio-temporal attention, with no convolutional operations
Efficient Video Understanding
Handles long video sequences efficiently by applying temporal attention and spatial attention separately rather than jointly
Large-scale Pretraining
Pretrained and fine-tuned on the large-scale Kinetics-400 video dataset
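To make the efficiency claim about separated spatial and temporal attention concrete, the following back-of-the-envelope sketch counts query-key comparisons per attention layer. The clip shape (8 frames at 224 x 224 pixels, split into 16 x 16 patches) is an assumption about the base configuration and is not given on this page. The function name `attention_pairs` is hypothetical, and the classification token is ignored for simplicity.

```python
def attention_pairs(num_frames: int, patches_per_frame: int) -> dict:
    """Count query-key comparisons for joint vs. separated space-time attention."""
    tokens = num_frames * patches_per_frame
    # Joint attention: every token attends to every token in the clip.
    joint = tokens * tokens
    # Separated attention: each token attends over the frames at its own
    # spatial position (temporal step), then over the patches in its own
    # frame (spatial step).
    divided = tokens * (num_frames + patches_per_frame)
    return {"tokens": tokens, "joint": joint, "divided": divided}


patches = (224 // 16) ** 2  # 14 x 14 = 196 patches per frame (assumed sizes)
costs = attention_pairs(num_frames=8, patches_per_frame=patches)
print(costs)  # {'tokens': 1568, 'joint': 2458624, 'divided': 319872}
print(round(costs["joint"] / costs["divided"], 2))  # 7.69
```

Under these assumptions the separated scheme needs roughly 7.7 times fewer comparisons, and the gap widens as more frames are added, because the joint cost grows quadratically with the total token count.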
Model Capabilities
Video Classification
Spatio-Temporal Feature Extraction
Action Recognition
Use Cases
Video Content Analysis
Action Recognition
Identifies human actions and behaviors in videos
Can recognize 400 different action categories
Video Content Classification
Automatically classifies and tags video content
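One possible way to tag a video longer than a single clip is to split it into segments, sample a fixed number of frames from each, classify each segment, and average the scores. The sketch below shows only the sampling and aggregation steps. The function names, the 8-frame clip length, and the 0.3 threshold are illustrative assumptions, not values from this page.

```python
from collections import defaultdict
from typing import List, Sequence, Tuple


def sample_frame_indices(start: int, end: int, num_frames: int = 8) -> List[int]:
    """Pick num_frames indices spread evenly over the half-open range [start, end).

    If the range holds fewer than num_frames frames, some indices repeat.
    """
    span = end - start
    return [start + (i * span) // num_frames for i in range(num_frames)]


def aggregate_tags(
    clip_predictions: Sequence[Sequence[Tuple[str, float]]],
    threshold: float = 0.3,
) -> List[Tuple[str, float]]:
    """Average label probabilities over clips and keep labels at or above threshold.

    A label missing from a clip's predictions counts as probability 0 there.
    """
    totals = defaultdict(float)
    for preds in clip_predictions:
        for label, prob in preds:
            totals[label] += prob
    n = len(clip_predictions)
    means = {label: total / n for label, total in totals.items()}
    tags = [(label, mean) for label, mean in means.items() if mean >= threshold]
    return sorted(tags, key=lambda t: t[1], reverse=True)
```

For a 160-frame video split into two segments, `sample_frame_indices(0, 80)` and `sample_frame_indices(80, 160)` give the frames to feed to the classifier, and `aggregate_tags` merges the per-segment predictions into video-level tags.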
© 2025 AIbase