Zero-Shot Temporal Action Detection via Vision-Language Prompting

This work presents the Zero-Shot Temporal Action Detection via Vision-Language Prompting (STALE) model for the under-studied yet practically useful task of zero-shot temporal action detection (ZS-TAD).

BibTeX: