Learning to Predict Situation Hyper-Graphs for Video Question Answering
The SHG-VQA model predicts a situation hyper-graph structure composed of the actions and relations present in the input video.
Action Search: Spotting Actions in Videos
This paper proposes an action search method for spotting actions in videos.
Breakfast dataset
The Breakfast dataset, also used in the paper, contains 1,712 videos of people performing various breakfast-preparation activities, such as making coffee or scrambling eggs. ...
AViD Dataset: Anonymized Videos from Diverse Countries
AViD is a new public video dataset for action recognition, containing anonymized action videos collected from diverse countries.
THUMOS'13 Dataset
The THUMOS'13 dataset contains 3,207 videos spanning 24 action classes.
JHMDB Dataset
The JHMDB dataset contains 21 action classes spanning sports and daily activities.
UCF Sports Dataset
The UCF Sports dataset contains 10 action classes from the sports domain.
THUMOS'14, ActivityNet v1.3
THUMOS'14 and ActivityNet v1.3 are benchmarks for temporal action detection in untrimmed videos, used in works such as "Temporal action detection in untrimmed videos via multi-stage CNNs", "CDC: Convolutional-De-Convolutional networks for precise temporal action localization", and "Temporal action...
SCUBA and SCUFO
SCUBA and SCUFO are the datasets used in the paper to evaluate static bias in action representations.
ActivityNet-1.3
Generating human action proposals in untrimmed videos is an important yet challenging task with wide applications. Current methods often suffer from the noisy boundary locations...
UCF-24 and JHMDB-21
UCF-24 and JHMDB-21 are two public action datasets used to evaluate action detection algorithms.
Kinetics dataset
The Kinetics dataset is a large-scale action recognition dataset containing videos of humans performing various actions, each annotated with the action performed.
UCF-101 dataset
The UCF-101 dataset is a large-scale action recognition dataset containing 13,320 videos across 101 human action categories.
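For concreteness, the sketch below shows one way such clip-level action datasets can be loaded with torchvision's UCF101 wrapper. It is a minimal illustration, not part of any of the papers above: the paths, clip length, and batch size are placeholders, and it assumes the UCF-101 videos, the official train/test split files, and a video decoding backend (e.g., PyAV) are available locally.

```python
# Minimal sketch: loading UCF-101 clips with torchvision for action recognition.
# Paths, clip length, and batch size are placeholders; the videos and the official
# train/test split files ("ucfTrainTestlist") are assumed to be downloaded already.
import torch
from torch.utils.data import DataLoader
from torchvision.datasets import UCF101


def make_ucf101_loader(root="UCF-101", annotation_path="ucfTrainTestlist",
                       frames_per_clip=16, batch_size=8):
    # Each sample is a (video, audio, label) tuple; video is a uint8 tensor [T, H, W, C].
    dataset = UCF101(
        root=root,
        annotation_path=annotation_path,
        frames_per_clip=frames_per_clip,
        step_between_clips=frames_per_clip,  # non-overlapping clips
        train=True,
    )

    def collate(batch):
        # Drop the audio track; in practice a resize/normalize transform would also be applied.
        videos = torch.stack([video for video, _, _ in batch])
        labels = torch.tensor([label for _, _, label in batch])
        return videos, labels

    return DataLoader(dataset, batch_size=batch_size, shuffle=True, collate_fn=collate)
```

A training loop would iterate over the returned loader and permute the [B, T, H, W, C] clip tensors into whatever layout the chosen video model expects.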
Kinetics-600
The Kinetics-600 dataset consists of 392k training videos and 30k validation videos in 600 human action categories.
Kinetics Human Action Video Dataset
The Kinetics dataset is a large-scale video dataset for human action recognition.
Kinetics-400
Motion has been shown to be useful for video understanding, where it is typically represented by optical flow. However, computing flow from video frames is very time-consuming. ...
Motion-driven Visual Tempo Learning for Video-based Action Recognition
The paper proposes a Temporal Correlation Module (TCM) to deal with the variation of action visual tempo in videos; it includes a Multi-scale Temporal Dynamics Module (MTDM) and a...