- Dense regression network for video grounding
- Semantic conditioned dynamic modulation for temporal sentence grounding in videos
- Multilevel language and vision integration for text-to-clip retrieval
- TALL: Temporal activity localization via language query
- Support-Set Based Cross-Supervision for Video Grounding
- Localizing moments in video with natural language