Temporal Sentence Grounding in Videos

Temporal sentence grounding in videos (TSGV) is the task of retrieving, from an untrimmed video, the segment that semantically corresponds to a natural-language query.
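
To make the input and output of the task concrete, here is a minimal sketch, assuming a segment is represented as a (start, end) pair in seconds. The `Segment` class, the `temporal_iou` helper, and the example query and timestamps are all hypothetical and chosen only for illustration. Temporal intersection over union (IoU) is a common way to compare a retrieved segment with an annotated one; it is shown here as one reasonable choice, not as the metric prescribed by this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A video segment given by its start and end time in seconds (hypothetical type)."""
    start: float
    end: float


def temporal_iou(pred: Segment, gt: Segment) -> float:
    """Intersection over union of two time intervals, in [0, 1]."""
    # Overlap is zero when the intervals are disjoint.
    inter = max(0.0, min(pred.end, gt.end) - max(pred.start, gt.start))
    union = (pred.end - pred.start) + (gt.end - gt.start) - inter
    return inter / union if union > 0 else 0.0


# Illustrative values only: a query, the segment a model might retrieve,
# and the annotated segment it is compared against.
query = "a person opens the door"
predicted = Segment(start=4.0, end=10.0)
annotated = Segment(start=5.0, end=12.0)

print(query, temporal_iou(predicted, annotated))
```

In this toy case the two segments overlap for 5 seconds and together span 8 seconds, so the IoU is 0.625.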

BibTeX: