Penn Action
The Penn Action dataset is a real video dataset of people performing various indoor and outdoor sports with annotations of human joint locations.
MMX-Trailer-20 Dataset
Long form video understanding (LVU) is a sub-domain of video recognition concerned with understanding contextual information across contiguous shots which can contain multiple...
LocalStyleFool
LocalStyleFool: Regional Video Style Transfer Attack Using Segment Anything
Mini-Kinetics
The Mini-Kinetics dataset is a mini version of the Kinetics-400 dataset, containing 240k training samples and 20k validation samples in 400 human action classes.
HMDB51 dataset
The HMDB51 dataset is a video dataset for human action recognition. It contains 6,767 videos annotated with 51 categories of human actions.
Kinetics-700 dataset
The Kinetics-700 dataset is a large-scale video dataset for human action recognition. It contains 555,774 videos annotated with 700 categories of human actions.
Moments in Time
The Moments in Time dataset is a large-scale video action recognition dataset.
MoViNets: Mobile Video Networks for Efficient Video Recognition
Mobile Video Networks (MoViNets) are a family of computation- and memory-efficient video networks that can operate on streaming video for online inference.
Something-Something V1
Video classification is a fundamental problem in many video-based tasks. Applications such as autonomous driving and the control of drones and robots are driving the demand...
Kinetics-600
The Kinetics-600 dataset consists of 392k training videos and 30k validation videos in 600 human action categories.
Kinetics-400
Motion has been shown to be useful for video understanding, where motion is typically represented by optical flow. However, computing flow from video frames is very time-consuming....
Something-Something V1 & V2
The Something-Something V1 & V2 dataset is a large-scale video dataset created by crowdsourcing. It contains about 100k videos over 174 categories, and the number of videos...