Dataset - LDM

TGIF-QA

The TGIF-QA dataset consists of 165,165 QA pairs chosen from 71,741 animated GIFs. To evaluate spatiotemporal reasoning ability at the video level, the TGIF-QA dataset designs...
- Dataset
- JSON
Temporally-Adaptive Convolutions for Video Understanding

Spatial convolutions are extensively used in numerous deep video models. They fundamentally assume spatio-temporal invariance, i.e., using shared weights for every location in...
- Dataset
- JSON
VATEX

The dataset used in the paper is a video question answering dataset for large-scale video-language pre-training.
- Dataset
- JSON
BIT: Bi-Level Temporal Modeling for Efficient Supervised Action Segmentation

A dataset for supervised action segmentation.
- Dataset
- JSON

You can also access this registry using the API (see API Docs).

4 datasets found