- DFDC Preview
  A preview dataset for the Deepfake Detection Challenge (DFDC).
- Day-to-Day Video Dataset
  A dataset of 30 videos, each 3 to 20 minutes long, covering five classes of daily activities: socializing, home repair, biking around urban areas, cooking, and home tours.
- ShareChat Video Posts Dataset
  Dataset of video posts created in the Hindi language over a period of one week on the ShareChat application, capturing both implicit signals (such as video play and skip) and...
- ActivityNet1.2
  The ActivityNet1.2 dataset is a large-scale benchmark for action recognition and localization in videos.
- Temporal Sentence Grounding in Videos
  Temporal sentence grounding in videos (TSGV) is the task of retrieving the video segment that semantically corresponds to a natural-language query.
- Skip-Clip: Self-Supervised Spatiotemporal Representation Learning by Future C...
  A self-supervised spatiotemporal representation learning approach for videos, combining temporal coherence and future clip order ranking.
- SCUBA and SCUFO
  Datasets used to evaluate static bias in action representations.
- Assembly101: a large-scale multi-view video dataset for understanding procedu...
  Assembly101: a large-scale multi-view video dataset for understanding procedural activities
- YouTube-VIS Dataset
  The YouTube-VIS dataset is a large-scale dataset for video instance segmentation, containing 2,883 videos and 131K instance masks.
- InternVid: A Large-Scale Video-Text Dataset for Multimodal Understanding and ...
  InternVid: A large-scale video-text dataset for multimodal understanding and generation.
- THUMOS Challenge: Action Recognition with a Large Number of Classes
  THUMOS Challenge: Action Recognition with a Large Number of Classes.
- ActivityNet-1.3
  Generating human action proposals in untrimmed videos is an important yet challenging task with wide applications. Current methods often suffer from the noisy boundary locations...
- SDU Fall Dataset
  The dataset contains videos of people performing normal activities and falls.
- UR Dataset
  The dataset contains videos of people performing normal activities and falls.
- Spatio-Temporal Adversarial Learning for Detecting Unseen Falls
  A spatio-temporal adversarial learning framework for detecting unseen falls in videos.