DiDeMo

DiDeMo (Distinct Describable Moments) is a large-scale video-text dataset containing 10,000 videos and 40,000 annotations.

BibTeX: