ActivityNet 2019 Task 3: Exploring Contexts for Dense Captioning Events in Videos

Contextual reasoning is essential for understanding events in long untrimmed videos. In this work, we systematically explore different captioning models with various contexts for the task of dense captioning of events in videos, which aims to generate a caption for each event in an untrimmed video.

Cite this as

Shizhe Chen, Yuqing Song, Yida Zhao, Qin Jin, Zhaoyang Zeng, Bei Liu, Jianlong Fu, Alexander Hauptmann (2024). Dataset: ActivityNet 2019 Task 3: Exploring Contexts for Dense Captioning Events in Videos. https://doi.org/10.57702/qmxe5ovb

DOI retrieved: December 16, 2024

Additional Info

Field: Value
Created: December 16, 2024
Last update: December 16, 2024
Defined In: https://doi.org/10.48550/arXiv.1907.05092
Author: Shizhe Chen
More Authors: Yuqing Song, Yida Zhao, Qin Jin, Zhaoyang Zeng, Bei Liu, Jianlong Fu, Alexander Hauptmann
Homepage: https://arxiv.org/abs/1806.08854