2 datasets found

Tags: natural language query

  • Charades-STA dataset

    Temporal grounding of activities, the identification of specific time intervals of actions within a larger event context, is a critical task in video understanding.
  • QVHighlights

    QVHighlights is a dataset for video highlight detection, which consists of over 10,000 videos annotated with human-written text queries.

You can also access this registry using the API (see API Docs).