-
Kinetics-400 and Kinetics-600
The Kinetics-400 and Kinetics-600 datasets are video understanding datasets used for learning rich and multi-scale spatiotemporal semantics from high-dimensional videos. -
Kinetics-600
The Kinetics-600 dataset consists of 392k training videos and 30k validation videos in 600 human action categories. -
UCF101 dataset
UCF101 dataset is used to test the proposed text-to-video model. The dataset contains 101 action categories, and each category has 10 videos. The videos are labeled with text...