-
Charades-STA dataset
Temporal grounding of activities, the identification of specific time intervals of actions within a larger event context, is a critical task in video understanding. -
PLOT-TAL - Prompt Learning with Optimal Transport for Few-Shot Temporal Actio...
Temporal Action Localization (TAL) in few-shot learning. Our work addresses the inherent limitations of conventional single-prompt learning methods that often lead to... -
ActivityNet, MSR-VTT, and MSVD
The datasets used in the paper are ActivityNet, MSR-VTT, and MSVD, which the authors use for text-to-video retrieval tasks. -
Visual Semantic Role Labeling for Video Understanding
Visual Semantic Role Labeling for Video Understanding. -
Kinetics-400, Something-Something-V2, and Epic-Kitchens-100
The authors used the Kinetics-400, Something-Something-V2, and Epic-Kitchens-100 datasets for video understanding tasks.