FiGCLIP: Fine-Grained CLIP Adaptation via Densely Annotated Videos
Fine-grained adaptation of the popular CLIP model across multiple datasets.
Motion-Guided Masking for Spatiotemporal Representation Learning
The authors used several video benchmarks, including Kinetics-400 and Something-Something V2, to evaluate their proposed motion-guided masking algorithm.
Kinetics-400, UCF101, HMDB51, Something-Something V1, and Something-Something V2
The Kinetics-400, UCF101, HMDB51, Something-Something V1, and Something-Something V2 datasets are used to evaluate the performance of Bi-Calibration Networks.
Kinetics-400
Motion has been shown to be useful for video understanding, where motion is typically represented by optical flow. However, computing flow from video frames is very time-consuming....
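To make the flow-based motion cue concrete, the sketch below computes dense Farneback optical flow between two grayscale frames with OpenCV and then ranks non-overlapping patches by mean motion magnitude to decide which ones to mask. This is a minimal, hypothetical illustration of motion-guided patch selection, not the implementation from any of the papers listed above; the 16x16 patch size, 75% mask ratio, and choice of Farneback flow are assumptions.

```python
import cv2
import numpy as np

def dense_flow_magnitude(prev_gray, next_gray):
    """Compute dense Farneback optical flow between two grayscale frames
    and return the per-pixel motion magnitude (this step is the costly one)."""
    flow = cv2.calcOpticalFlowFarneback(
        prev_gray, next_gray, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
    mag, _ = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    return mag

def motion_guided_patch_mask(magnitude, patch=16, mask_ratio=0.75):
    """Rank non-overlapping patches by mean motion magnitude and mask the
    highest-motion ones. Patch size and mask ratio are hypothetical defaults."""
    h, w = magnitude.shape
    gh, gw = h // patch, w // patch
    patch_motion = (magnitude[:gh * patch, :gw * patch]
                    .reshape(gh, patch, gw, patch)
                    .mean(axis=(1, 3)))
    n_mask = int(mask_ratio * gh * gw)
    order = np.argsort(patch_motion.ravel())[::-1]  # highest motion first
    mask = np.zeros(gh * gw, dtype=bool)
    mask[order[:n_mask]] = True
    return mask.reshape(gh, gw)
```

Because the flow computation dominates the cost, approaches in this space often replace it with cheaper motion proxies (e.g., frame differences or codec motion vectors); the masking step itself is inexpensive.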