1 dataset found

Tags: multimodal learning

  • LSMDC

    The Large Scale Movie Description Challenge (LSMDC) dataset consists of 118,081 short video clips extracted from 202 movies, each annotated with a single caption.