Augmented 2D-TAN: A Two-stage Approach for Human-centric Spatio-Temporal Video Grounding

The human-centric spatio-temporal video grounding (HC-STVG) task aims to localize the spatio-temporal tube of a target person specified by a natural-language description.
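
To make the term "spatio-temporal tube" concrete, the following minimal sketch represents a tube as one bounding box per frame over a contiguous span of frames. The `Tube` class, the `(x1, y1, x2, y2)` box convention, and the example values are all assumptions made for illustration; they are not taken from the dataset or from the authors' code.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Assumed box convention: (x1, y1, x2, y2) in pixel coordinates.
Box = Tuple[float, float, float, float]


@dataclass
class Tube:
    """Hypothetical container for one spatio-temporal tube."""

    # Maps a frame index to the box of the target person in that frame.
    boxes: Dict[int, Box]

    def temporal_extent(self) -> Tuple[int, int]:
        """Return the first and last frame index covered by the tube."""
        return min(self.boxes), max(self.boxes)


# Made-up example: a three-frame tube for a person moving slightly to the right.
tube = Tube(
    boxes={
        10: (50.0, 40.0, 120.0, 200.0),
        11: (52.0, 41.0, 122.0, 201.0),
        12: (55.0, 42.0, 125.0, 203.0),
    }
)

start, end = tube.temporal_extent()
print(f"tube spans frames {start} to {end} with {len(tube.boxes)} boxes")
```

Under this representation, a grounding model takes a video and a sentence as input and outputs both parts of the tube: the temporal extent (when the described person appears) and the per-frame boxes (where the person is).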

Cite this as

Chaolei Tan, Zihang Lin, Jian-Fang Hu, Xiang Li, Wei-Shi Zheng (2024). Dataset: Augmented 2D-TAN: A Two-stage Approach for Human-centric Spatio-Temporal Video Grounding. https://doi.org/10.57702/0vperfpg

DOI retrieved: December 2, 2024

Additional Info

Field: Value
Created: December 2, 2024
Last update: December 2, 2024
Defined In: https://doi.org/10.48550/arXiv.2106.10634
Author: Chaolei Tan
More Authors: Zihang Lin, Jian-Fang Hu, Xiang Li, Wei-Shi Zheng
Homepage: https://www.picdataset.com/challenge/leaderboard/hcvg2021