-
UCSD Pedestrian
A dataset used for local anomaly detection in videos with object-centric adversarial learning. -
UCF101, HMDB51, Olympic Sports Dataset
-
RealityTalk: Real-Time Speech-Driven Augmented Presentation for AR Live Story...
RealityTalk is a system that augments live video presentations with speech-driven interactive virtual elements. -
Kinetics and Something-Something V2 datasets
A dataset used for few-shot video classification, containing videos from the Kinetics and Something-Something V2 datasets. -
EDU dataset
The EDU dataset consists of hand-crafted features extracted for 5,611 students from 10 weeks of data. -
HOT-2023 Dataset
The HOT-2023 dataset is a hyperspectral object tracking dataset, which consists of 110 training videos and 87 validation videos. The dataset includes 16, 15, and 25 bands for... -
HOT-2022 Dataset
The HOT-2022 dataset is a hyperspectral object tracking dataset, which consists of 40 training videos and 35 testing videos. The dataset includes 16 bands and has a resolution... -
DML-iTrack-HDR dataset
A dataset of eye-tracking experiments for HDR videos. -
EPIC-KITCHENS
EPIC-KITCHENS is a large-scale egocentric video benchmark recorded by 32 participants in their native kitchen environments. Our videos depict non-scripted daily activities: we... -
RF PIX2PIX Unsupervised Wi-Fi to Video Translation
-
KTH action dataset
The KTH action dataset consists of humans performing 6 types of actions: boxing, clapping, waving, jogging, running, and walking under 4 scenarios: outdoors, outdoors with scale... -
Chinese Sign Language
The Chinese Sign Language (CSL) dataset consists of Chinese sign language videos. It contains 100 Chinese daily expressions, each demonstrated 5 times by 50 presenters. -
Amazon-Video
The Amazon-Video dataset contains user-item interactions from the Amazon video streaming service and is used for evaluating the performance of recommendation algorithms. -
Temporal Cross-attention for Action Recognition
Feature shifts have been shown to be useful for action recognition with CNN-based models since Temporal Shift Module (TSM) was proposed. It is based on frame-wise feature...