7 datasets found

Formats: JSON
Tags: video grounding

  • TACoS Speech

    The TACoS Speech dataset contains a large number of open-world videos with more shot transitions.
  • Charades-STA Speech

    The Charades-STA Speech dataset contains a large number of open-world videos with more shot transitions.
  • ActivityNet Speech

    The ActivityNet Speech dataset contains a large number of open-world videos with more shot transitions.
  • TACoS

    A dataset of videos with multiple sentence descriptions, used for activity recognition and video description tasks.
  • Charades-STA

    The Charades-STA dataset contains 12,408/3,720 segment-sentence pairs and 5,338/1,334 videos in the training and test sets, respectively.
  • Language-free Training for Zero-shot Video Grounding

    Given an untrimmed video and a language query, video grounding aims to localize the time interval that matches the query by understanding the text and video simultaneously.
  • ActivityNet Captions

    ActivityNet Captions is a benchmark dataset proposed for dense video captioning. There are 20K untrimmed videos in total, and each video has several annotated segments with...
You can also access this registry using the API (see API Docs).
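
The video grounding task described in the listing above (localizing the time interval in an untrimmed video that matches a language query) can be illustrated with a minimal sketch. This is not the registry API and not the schema of any dataset listed here: the `GroundingExample` layout, its field names, and the sample values are hypothetical, and temporal intersection-over-union is shown only as one common way to compare a predicted interval against an annotated one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroundingExample:
    """One segment-sentence pair (hypothetical layout, not a dataset schema)."""
    video_id: str   # identifier of the untrimmed video
    query: str      # language query describing the moment
    start: float    # annotated segment start, in seconds
    end: float      # annotated segment end, in seconds


def temporal_iou(pred, gold):
    """Intersection-over-union of two (start, end) intervals in seconds."""
    # Overlap length is clamped at zero for disjoint intervals.
    inter = max(0.0, min(pred[1], gold[1]) - max(pred[0], gold[0]))
    union = (pred[1] - pred[0]) + (gold[1] - gold[0]) - inter
    return inter / union if union > 0 else 0.0


# Made-up example values, for illustration only.
example = GroundingExample("video_0001", "a person opens the door", 2.0, 8.0)
predicted = (4.0, 10.0)
score = temporal_iou(predicted, (example.start, example.end))
print(score)  # 0.5
```

Here the predicted and annotated intervals overlap for 4 seconds out of a combined span of 8 seconds, so the score is 0.5.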