Webvid-10M

Webvid-10M is a large-scale dataset of short videos with textual descriptions, used to train the video model.

Data and Resources

Cite this as

Max Bain, Arsha Nagrani, Gül Varol, Andrew Zisserman (2024). Dataset: Webvid-10M. https://doi.org/10.57702/lral74s0

DOI retrieved: December 2, 2024

Additional Info

Field Value
Created December 2, 2024
Last update December 3, 2024
Defined In https://doi.org/10.48550/arXiv.2310.19512
Author Max Bain
More Authors
Arsha Nagrani
Gül Varol
Andrew Zisserman
Homepage https://arxiv.org/abs/2106.10933