Video-Specific Query-Key Attention Modeling for Weakly-Supervised Temporal Action Localization
Weakly-supervised temporal action localization aims to identify and localize action instances in untrimmed videos given only video-level action labels.
BibTeX: