
Microsoft Video Description Corpus (MSVD)

The MSVD dataset is a public video captioning benchmark that contains 1,970 short video clips with about 80,000 descriptions.

Data and Resources

This dataset has no data

Yuyu Guo, Jingqiu Zhang, Lianli Gao (2024). Dataset: Microsoft Video Description Corpus (MSVD). https://doi.org/10.57702/0two3jmq

Private DOI: This DOI is not yet resolvable. It is available for use in manuscripts and will be published when the dataset is made public.

Field	Value
Created	December 2, 2024
Last update	December 2, 2024
Author	Yuyu Guo
More Authors	Jingqiu Zhang, Lianli Gao
Homepage	https://www.microsoft.com/en-us/research/project/msvd-video-description-corpus