- Charades-STA dataset
Temporal grounding of activities, the identification of specific time intervals of actions within a larger event context, is a critical task in video understanding.
- QVHighlights
QVHighlights is a dataset for video highlight detection, which consists of over 10,000 videos annotated with human-written text queries.