Charades-STA dataset
Temporal grounding of activities, i.e., identifying the specific time intervals in which actions occur within a larger event context, is a critical task in video understanding.
Localizing moments in video with natural language