Motion-Guided Masking for Spatiotemporal Representation Learning
The authors evaluate their proposed motion-guided masking algorithm on several video benchmarks, including Kinetics-400 and Something-Something V2.
Kinetics-400 and Kinetics-600
Kinetics-400 and Kinetics-600 are video understanding datasets used for learning rich, multi-scale spatiotemporal semantics from high-dimensional videos.
Kinetics-400
Motion has been shown to be useful for video understanding, where it is typically represented by optical flow. However, computing flow from video frames is very time-consuming...