4 datasets found

Tags: Video Moment Retrieval

  • Thumos14

    A video moment retrieval task that retrieves only the video clip containing a given action in response to a query.
  • ActivityNet v1.2

    Weakly-Supervised Temporal Action Localization (WSTAL) aims to localize actions in untrimmed videos with only video-level labels.
  • Charades-STA

    The Charades-STA dataset contains 12,408/3,720 segment-sentence pairs and 5,338/1,334 videos in the training and test sets, respectively.
  • ActivityNet Captions

    ActivityNet Captions is a benchmark dataset proposed for dense video captioning. There are 20K untrimmed videos in total, and each video has several annotated segments with...
You can also access this registry using the API (see API Docs).