ActivityNet, MSR-VTT, and MSVD

The datasets used in the paper are ActivityNet, MSR-VTT, and MSVD. The authors used them for text-to-video retrieval.

BibTeX: