Timesformer Base Finetuned Ssv2
TimeSformer is a video understanding model built on a spatio-temporal attention mechanism. This checkpoint is fine-tuned on the Something Something v2 dataset for video classification.
Downloads 379
Release Time: 10/7/2022
Model Overview
This model classifies a video into one of the 174 Something Something v2 labels, using a pure attention mechanism to process the video data.
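A minimal usage sketch of the classification flow described above, assuming the checkpoint is published on the Hugging Face Hub as `facebook/timesformer-base-finetuned-ssv2` and expects 8 frames per clip (neither detail is stated on this page). `sample_frame_indices` and `classify_clip` are hypothetical helper names introduced for illustration.

```python
import numpy as np

# Assumed Hugging Face Hub ID; this page does not name the checkpoint.
MODEL_ID = "facebook/timesformer-base-finetuned-ssv2"


def sample_frame_indices(total_frames: int, num_frames: int = 8) -> np.ndarray:
    """Pick `num_frames` evenly spaced frame indices from a clip.

    The default of 8 frames is an assumption about the base checkpoint.
    """
    if total_frames < 1:
        raise ValueError("clip must contain at least one frame")
    return np.linspace(0, total_frames - 1, num=num_frames).round().astype(int)


def classify_clip(frames, model_id: str = MODEL_ID, top_k: int = 5):
    """Return the top-k (label, probability) pairs for one clip.

    `frames` is a list of H x W x 3 uint8 RGB arrays, already sampled.
    """
    # Heavy dependencies are imported lazily so the sampling helper above
    # stays usable without torch/transformers installed.
    import torch
    from transformers import AutoImageProcessor, TimesformerForVideoClassification

    processor = AutoImageProcessor.from_pretrained(model_id)
    model = TimesformerForVideoClassification.from_pretrained(model_id)
    model.eval()

    # The processor resizes and normalizes the frames and stacks them
    # into a single-clip batch.
    inputs = processor(frames, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, num_labels)

    probs = logits.softmax(dim=-1)[0]
    top = torch.topk(probs, k=top_k)
    return [
        (model.config.id2label[i.item()], p.item())
        for p, i in zip(top.values, top.indices)
    ]
```

With a decoded clip held as a list `all_frames`, a call might look like `classify_clip([all_frames[i] for i in sample_frame_indices(len(all_frames))])`. Uniform sampling is one common choice here, not necessarily the sampling used when the model was trained.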
Model Features
Pure Attention Mechanism
Relies entirely on a spatio-temporal attention mechanism to process video data, with no convolutional operations
Efficient Video Understanding
Captures both spatial features within frames and temporal features across frames
Pre-training and Fine-tuning Paradigm
Fine-tuned on the Something Something v2 dataset, which suits it to classifying videos into that dataset's label set
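To illustrate what convolution-free spatio-temporal attention can look like, here is a toy NumPy sketch. It assumes the "divided space-time" scheme from the TimeSformer paper (temporal attention followed by spatial attention), which this page does not specify. It also simplifies heavily: a single head, no learned projections, no classification token, no layer normalization or MLP. The patch count (14x14) and embedding width (64) are illustrative assumptions.

```python
import numpy as np


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(tokens: np.ndarray) -> np.ndarray:
    """Scaled dot-product self-attention over the second-to-last axis.

    tokens: (..., N, D). Simplification: Q = K = V = tokens.
    """
    d = tokens.shape[-1]
    scores = tokens @ np.swapaxes(tokens, -1, -2) / np.sqrt(d)  # (..., N, N)
    return softmax(scores, axis=-1) @ tokens


def divided_space_time_attention(x: np.ndarray) -> np.ndarray:
    """x: (T, P, D) patch embeddings for T frames with P patches each."""
    # Temporal step: each patch location attends across the T frames.
    xt = np.swapaxes(x, 0, 1)                      # (P, T, D)
    x = x + np.swapaxes(self_attention(xt), 0, 1)  # residual connection
    # Spatial step: each frame attends across its own P patches.
    x = x + self_attention(x)
    return x


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 196, 64))  # 8 frames, 14x14 patches, toy width 64
y = divided_space_time_attention(x)
print(y.shape)  # (8, 196, 64)
```

Splitting attention this way means each token attends to T + P other tokens per block rather than T x P, which is the usual motivation given for the divided scheme.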
Model Capabilities
Video Classification
Spatio-Temporal Feature Extraction
Video Content Understanding
Use Cases
Video Analysis
Action Recognition
Recognize human actions and behaviors in videos
Classifies videos into 174 action categories
Video Content Understanding
Understand object interactions and scene changes in videos