TimeSformer HR Finetuned K600
TimeSformer is a video classification model built on spatio-temporal self-attention and designed for video understanding tasks.
Downloads: 2,927
Release Time: 10/7/2022
Model Overview
This model is trained on the Kinetics-600 dataset and classifies a video into one of 600 categories. It processes video with attention alone, without convolutional operations.
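A minimal inference sketch in Python, assuming the checkpoint is published on the Hugging Face Hub as facebook/timesformer-hr-finetuned-k600 and expects clips of 16 frames. Neither detail is stated on this page, so treat both as assumptions. The helper names sample_frame_indices and classify_video are hypothetical. The torch and transformers packages are imported lazily inside classify_video, so the frame-sampling helper runs without them.

```python
# Sketch only: the checkpoint id and clip length below are assumptions,
# not values stated on this page.
MODEL_ID = "facebook/timesformer-hr-finetuned-k600"  # assumed Hub id
CLIP_LEN = 16  # assumed number of frames per input clip


def sample_frame_indices(total_frames, clip_len=CLIP_LEN):
    """Pick clip_len frame indices spread evenly over the whole video."""
    if total_frames < 1:
        raise ValueError("video has no frames")
    if clip_len == 1:
        return [0]
    # Integer arithmetic keeps the first index at 0 and the last at total_frames - 1.
    return [(i * (total_frames - 1)) // (clip_len - 1) for i in range(clip_len)]


def classify_video(frames, model_id=MODEL_ID):
    """Return the predicted label for one clip.

    frames: list of RGB frames (for example HxWx3 uint8 NumPy arrays),
    already sampled down to the clip length the checkpoint expects.
    """
    # Imported lazily so sample_frame_indices works without these packages.
    import torch
    from transformers import AutoImageProcessor, TimesformerForVideoClassification

    processor = AutoImageProcessor.from_pretrained(model_id)
    model = TimesformerForVideoClassification.from_pretrained(model_id)
    model.eval()

    # The processor resizes and normalizes the frames and stacks them into a batch of one clip.
    inputs = processor(list(frames), return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # one score per category
    return model.config.id2label[logits.argmax(-1).item()]
```

Typical use would be to decode a video into frames, keep the frames at sample_frame_indices(len(all_frames)), and pass that list to classify_video. Loading the model downloads the weights on first use.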
Model Features
Pure Attention Mechanism
Built entirely on the Transformer architecture, using spatio-temporal attention to process video without traditional convolutional operations.
Efficient Video Understanding
Designed for video sequences: it captures spatial features within each frame and temporal features across frames.
Large-Scale Pre-training
Pre-trained on the large-scale Kinetics-600 video dataset, which spans 600 action categories.
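To make the spatio-temporal attention feature above more concrete, the following Python sketch counts how many query-key comparisons a single patch token makes in one attention block. It assumes the divided space-time attention scheme from the TimeSformer paper, where a token first attends over time and then over space, and compares it with joint attention over all patches of all frames. The figures used in the example (448x448 frames, 16x16 patches, 16 frames) are assumptions about the HR configuration, not values given on this page, and the function names are hypothetical.

```python
def tokens_per_frame(image_size, patch_size):
    """Number of non-overlapping square patches in one square frame."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    side = image_size // patch_size
    return side * side


def comparisons_per_token(frames, patches, scheme):
    """Query-key comparisons made by one patch token in one attention block."""
    if scheme == "joint":
        # The token attends to every patch of every frame, plus the class token.
        return frames * patches + 1
    if scheme == "divided":
        # Temporal pass: the same patch position in every frame, plus the class token.
        # Spatial pass: every patch of the token's own frame, plus the class token.
        return (frames + 1) + (patches + 1)
    raise ValueError("scheme must be 'joint' or 'divided'")


# Assumed HR configuration: 448x448 input, 16x16 patches, 16 frames.
patches = tokens_per_frame(448, 16)
print(patches)                                        # 784
print(comparisons_per_token(16, patches, "joint"))    # 12545
print(comparisons_per_token(16, patches, "divided"))  # 802
```

Under these assumptions, divided attention needs roughly 15 times fewer comparisons per token than joint attention, which is what makes higher-resolution input practical for an attention-only model.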
Model Capabilities
Video Classification
Spatio-Temporal Feature Extraction
Video Content Understanding
Use Cases
Video Analysis
Action Recognition
Identify human actions and behaviors in videos
Can recognize 600 action categories from the Kinetics-600 dataset
Video Content Classification
Automatically classify and tag video content
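For the tagging use case above, a classifier's raw scores can be turned into a short list of tags with probabilities. The Python sketch below applies a softmax to the scores and keeps the top k labels above a probability threshold. The function top_k_tags, its parameters, and the three-label mapping in the example are hypothetical illustrations, not part of the model or its real label list.

```python
import math


def top_k_tags(logits, id2label, k=5, min_prob=0.0):
    """Return up to k (label, probability) pairs, most probable first."""
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank class indices by probability and drop tags below the threshold.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k] if probs[i] >= min_prob]


# Toy example with a made-up three-class label mapping.
toy_labels = {0: "playing guitar", 1: "juggling balls", 2: "riding a bike"}
for label, prob in top_k_tags([2.0, 0.5, 3.0], toy_labels, k=2):
    print(f"{label}: {prob:.2f}")
```

With a real model, logits would be the 600 scores produced for a clip and id2label the label mapping shipped with the checkpoint. Raising min_prob trades recall for precision when tagging automatically.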