1 dataset found

Formats: JSON
Tags: Multimodal Large Language Models

  • Charades-STA dataset

    Temporal grounding of activities, the identification of specific time intervals of actions within a larger event context, is a critical task in video understanding.
You can also access this registry using the API (see API Docs).