Webvid-10M

Webvid-10M is a large-scale dataset of short videos with textual descriptions, used here for training the video model.

Data and Resources

Original Metadata (JSON)
The JSON representation of the dataset and its distributions, based on DCAT.

Max Bain, Arsha Nagrani, Gül Varol, Andrew Zisserman (2024). Dataset: Webvid-10M. https://doi.org/10.57702/lral74s0

DOI retrieved: December 2, 2024

Field	Value
Created	December 2, 2024
Last update	December 3, 2024
Defined In	https://doi.org/10.48550/arXiv.2310.19512
Author	Max Bain
More Authors	Arsha Nagrani, Gül Varol, Andrew Zisserman
Homepage	https://arxiv.org/abs/2106.10933