-
50 Salads dataset
The dataset used in the paper is the 50 Salads dataset, which contains 50 videos showing people preparing salads. The dataset is used for dense action forecasting, and the... -
Breakfast dataset
The Breakfast dataset is another dataset used in the paper, which contains 712 videos of people performing various activities, such as making coffee or scrambling eggs. The... -
Video-based Activity Recognition Dataset
The dataset used in the paper is a video-based activity recognition dataset, consisting of 90 video sequences spanning three scenarios: SHORT MEAL, HAVE SNACK and NORMAL MEAL. -
Soccer video processing for the detection of advertisement billboards
A dataset for detecting advertisement billboards in soccer video scenes. -
ADNet: A Deep Network for Detecting Adverts
A composite dataset consisting of positive and negative examples for billboard detection. -
AViD Dataset: Anonymized Videos from Diverse Countries
AViD is a new public video dataset for action recognition, containing action videos from diverse countries. -
Vcdb: A Large-Scale Database for Partial Copy Detection in Videos
A large-scale database for partial copy detection in videos. -
Trending YouTube Video Thumbnails
The dataset is a collection of YouTube trending videos with their thumbnails, metadata, and object labels. -
Collective Activities Dataset
The Collective Activities dataset consists of 44 videos taken in unconstrained real-world scenarios of people performing 6 individual actions across 5 group activities. -
THUMOS'13 Dataset
The THUMOS'13 dataset contains 24 classes and 3,207 videos. -
JHMDB Dataset
The JHMDB dataset contains 21 action classes covering sports and daily activities. -
UCF Sports Dataset
The UCF Sports dataset contains 10 action classes from the sports domain. -
Dense-Caption Matching and Frame-Selection Gating for Temporal Localization i...
This paper proposes a video question answering model that effectively integrates multi-modal input sources and finds the temporally relevant information to answer questions. -
THUMOS'14, ActivityNet v1.3
Temporal action detection in untrimmed videos via multi-stage CNNs; CDC: convolutional-de-convolutional networks for precise temporal action localization; Temporal action... -
Tinyvideos dataset
-
VisDrone2019 Dataset
The VisDrone2019 dataset contains 288 video clips comprising 261,908 frames, plus 10,209 static images.