- Temporally-Adaptive Convolutions for Video Understanding
Spatial convolutions are extensively used in numerous deep video models. They fundamentally assume spatio-temporal invariance, i.e., using shared weights for every location in...
- BIT: Bi-Level Temporal Modeling for Efficient Supervised Action Segmentation
Action segmentation dataset for supervised action segmentation