2 datasets found

Tags: text-video retrieval

  • DiDeMo

    DiDeMo is a large-scale video-text dataset containing 10,000 videos and 40,000 annotations.
  • LSMDC

    The LSMDC movie description dataset consists of 118,081 short video clips extracted from 202 movies, each annotated with a single caption.
You can also access this registry using the API (see API Docs).