Language-free Training for Zero-shot Video Grounding

Given an untrimmed video and a language query, video grounding aims to localize the time interval in the video that the query describes, which requires understanding the text and the video jointly.
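
To make the task's input and output concrete, here is a minimal sketch of the last step of a grounding pipeline: turning per-frame relevance scores into a single time interval. It is an illustration only, not the method of this paper. The function name `ground_query`, the `threshold` parameter, and the assumption that some upstream model has already scored each frame against the query are all hypothetical.

```python
from typing import List, Tuple


def ground_query(
    scores: List[float], fps: float, threshold: float = 0.5
) -> Tuple[float, float]:
    """Return (start_sec, end_sec) for the best-matching contiguous span.

    `scores[i]` is a hypothetical frame-to-query relevance score for frame i.
    The span is the contiguous run of frames that maximizes the sum of
    (score - threshold), found with Kadane's maximum-subarray algorithm.
    """
    if not scores:
        raise ValueError("scores must be non-empty")

    best_sum = float("-inf")
    best_span = (0, 0)  # inclusive frame indices
    cur_sum = 0.0
    cur_start = 0

    for i, score in enumerate(scores):
        gain = score - threshold
        if cur_sum <= 0:
            # A non-positive running sum cannot help; start a new span here.
            cur_sum = gain
            cur_start = i
        else:
            cur_sum += gain
        if cur_sum > best_sum:
            best_sum = cur_sum
            best_span = (cur_start, i)

    start_frame, end_frame = best_span
    # Convert inclusive frame indices to a [start, end) interval in seconds.
    return start_frame / fps, (end_frame + 1) / fps


if __name__ == "__main__":
    frame_scores = [0.1, 0.2, 0.9, 0.8, 0.7, 0.1]
    print(ground_query(frame_scores, fps=1.0))
```

In this toy input, frames 2 through 4 score above the threshold, so at one frame per second the predicted interval runs from second 2 to second 5.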
