-
Something-Something
The Something-Something dataset consists of 174 fine-grained action categories that depict humans performing everyday actions with common objects. Recognizing actions in the... -
Something-Something-V2
The Something-Something-V2 dataset is a large-scale video action recognition dataset. -
Something-Something v2 (SSv2)
The Something-Something v2 (SSv2) dataset is a large collection of video clips of humans performing actions with everyday objects. -
Fine-tuned CLIP Models are Efficient Video Learners
This work explores the capability of a simple baseline called ViFi-CLIP (Video Fine-tuned CLIP) for adapting image-based CLIP to the video domain. -
Mini-Kinetics
The Mini-Kinetics dataset is a mini version of the Kinetics-400 dataset, containing 240k training samples and 20k validation samples in 400 human action classes. -
HMDB51 dataset
The HMDB51 dataset is a video dataset for human action recognition. It contains 6,767 videos annotated with 51 categories of human actions. -
Kinetics-700 dataset
The Kinetics-700 dataset is a large-scale video dataset for human action recognition. It contains 555,774 videos annotated with 700 categories of human actions. -
Kinetics400
Video classification is a fundamental problem in many video-based tasks. Applications such as autonomous driving technology, controlling drones and robots are driving the demand... -
EGTEA Gaze+
The EGTEA Gaze+ dataset offers approximately 10,000 samples of 106 non-scripted daily activities that occur in a kitchen. -
HMDB51 and UCF101
The datasets used in the paper are HMDB51 and UCF101. -
Kinetics-400 and Something-Something-V2
The datasets used in the paper are Kinetics-400 and Something-Something-V2. -
Kinetics dataset
The Kinetics dataset is a large-scale action recognition dataset. It contains videos of various actions performed by humans, with annotations of the actions performed. -
UCF-101 dataset
The UCF-101 dataset is a large-scale action recognition dataset containing 13,320 videos categorized into 101 human action categories. -
Kinetics-400 and Kinetics-600
The Kinetics-400 and Kinetics-600 datasets are video understanding datasets used for learning rich and multi-scale spatiotemporal semantics from high-dimensional videos. -
Kinetics-600
The Kinetics-600 dataset consists of 392k training videos and 30k validation videos in 600 human action categories. -
Kinetics-400
Motion has been shown to be useful for video understanding, where motion is typically represented by optical flow. However, computing flow from video frames is very time-consuming....