-
InterVid-14M-aesthetics
InterVid-14M-aesthetics is a subset of InterVid-14M, used in the paper to remove watermarks from generated videos.
-
Kinetics-400 and Kinetics-600
Kinetics-400 and Kinetics-600 are video understanding datasets used for learning rich, multi-scale spatiotemporal semantics from high-dimensional videos.