Temporal Cross-attention for Action Recognition

Feature shifts have proven useful for action recognition with CNN-based models since the Temporal Shift Module (TSM) was proposed. TSM is based on frame-wise feature extraction with late fusion, and it shifts layer features along the time direction so that neighboring frames can interact.

Cite this as

Ryota Hashiguchi, Toru Tamaki (2024). Dataset: Temporal Cross-attention for Action Recognition. https://doi.org/10.57702/zk0xxcyw

DOI retrieved: December 16, 2024

Additional Info

Field         Value
Created       December 16, 2024
Last update   December 16, 2024
Author        Ryota Hashiguchi
More Authors  Toru Tamaki
Homepage      https://arxiv.org/abs/2103.00020