TimeSformer HR Fine-tuned SSv2
TimeSformer is a video classification model based on a spatio-temporal attention mechanism. This high-resolution (HR) variant is fine-tuned on the Something Something v2 dataset.
Downloads 14
Release Time: 12/10/2022
Model Overview
This model is primarily used for video classification: it assigns each input video one of the 174 Something Something v2 labels.
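To make the 174-way classification concrete, here is a minimal sketch of how a vector of raw class scores (logits) could be turned into ranked label predictions with a softmax. The function name `top_k_labels`, the placeholder label names, and the dummy logits are hypothetical and only for illustration; real label strings come from the Something Something v2 label map, and real logits come from the model.

```python
import math

NUM_CLASSES = 174  # number of Something Something v2 labels stated above


def top_k_labels(logits, labels, k=5):
    """Return the k most probable (label, probability) pairs."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    # Subtract the largest logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Placeholder label names (hypothetical); real names come from the dataset.
labels = [f"class_{i}" for i in range(NUM_CLASSES)]

# Dummy logits standing in for model output: class 42 scores highest.
logits = [0.0] * NUM_CLASSES
logits[42] = 5.0

predictions = top_k_labels(logits, labels, k=3)
print(predictions[0])
```

Returning the top few candidates rather than a single label can be useful here, since many Something Something v2 categories describe visually similar actions.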
Model Features
Spatio-Temporal Attention Mechanism
Uses a pure attention mechanism to process spatio-temporal information in videos, eliminating the need for convolution operations
High-Resolution Processing
Supports high-resolution video input (448x448 pixels)
End-to-End Training
Learns directly from raw video frames without manual feature extraction
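The features above imply a large number of attention tokens at 448x448 resolution. The following sketch works through the arithmetic under stated assumptions that are not given on this page: 16 input frames, 16x16-pixel patches, and the divided space-time attention scheme described in the TimeSformer paper. The classification token is ignored for simplicity, and the function name `attention_cost` is hypothetical.

```python
def attention_cost(frames, height, width, patch):
    """Count tokens and pairwise attention comparisons for one layer."""
    if height % patch or width % patch:
        raise ValueError("frame size must be divisible by the patch size")
    patches_per_frame = (height // patch) * (width // patch)
    tokens = frames * patches_per_frame
    # Joint space-time attention: every token attends to every token.
    joint = tokens * tokens
    # Divided attention: each token attends over time (the same patch
    # position in every frame), then over space (all patches in its frame).
    divided = tokens * (frames + patches_per_frame)
    return {
        "patches_per_frame": patches_per_frame,
        "tokens": tokens,
        "joint": joint,
        "divided": divided,
    }


# Assumed clip shape: 16 frames of 448x448 pixels, 16x16 patches.
cost = attention_cost(frames=16, height=448, width=448, patch=16)
print(cost)
print("joint / divided:", cost["joint"] / cost["divided"])
```

Under these assumptions each frame yields 784 patches and a clip yields 12,544 tokens, so separating temporal and spatial attention cuts the number of pairwise comparisons by roughly a factor of 16.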
Model Capabilities
Video Classification
Spatio-Temporal Feature Extraction
Action Recognition
Use Cases
Video Understanding
Action Recognition
Identify human actions and behaviors in videos
Classifies videos into 174 action categories
Video Content Analysis
Analyze video content and classify it automatically
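For video content analysis, videos of arbitrary length usually have to be reduced to a fixed-length clip before classification. Below is a minimal sketch of uniform frame sampling, assuming a 16-frame clip (an assumption, not stated on this page). The helper `sample_frame_indices` is hypothetical and is not part of any library.

```python
def sample_frame_indices(total_frames, clip_len=16):
    """Pick clip_len frame indices spread evenly across the video."""
    if total_frames <= 0 or clip_len <= 0:
        raise ValueError("total_frames and clip_len must be positive")
    # Split the video into clip_len equal segments and take the centre
    # frame of each one. Short videos simply repeat some frames.
    step = total_frames / clip_len
    return [
        min(total_frames - 1, int(step * i + step / 2))
        for i in range(clip_len)
    ]


# Example: a 300-frame video reduced to a 16-frame clip.
indices = sample_frame_indices(total_frames=300, clip_len=16)
print(indices)
```

Sampling segment centres keeps the clip representative of the whole video; the selected frames would then be resized to the model's input resolution before classification.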