ActivityNet 2019 Task 3: Exploring Contexts for Dense Captioning Events in Videos

doi:10.57702/qmxe5ovb

Contextual reasoning is essential for understanding events in long untrimmed videos. In this work, we systematically explore different captioning models with various contexts for the task of dense captioning of events in videos, which aims to generate captions for the different events in an untrimmed video.

Data and Resources

Original Metadata (JSON)
The JSON representation of the dataset with its distributions based on DCAT.

Cite this as

Shizhe Chen, Yuqing Song, Yida Zhao, Qin Jin, Zhaoyang Zeng, Bei Liu, Jianlong Fu, Alexander Hauptmann (2024). Dataset: ActivityNet 2019 Task 3: Exploring Contexts for Dense Captioning Events in Videos. https://doi.org/10.57702/qmxe5ovb

DOI retrieved: December 16, 2024

Additional Info

Field	Value
Created	December 16, 2024
Last update	December 16, 2024
Defined In	https://doi.org/10.48550/arXiv.1907.05092
Author	Shizhe Chen
More Authors	Yuqing Song, Yida Zhao, Qin Jin, Zhaoyang Zeng, Bei Liu, Jianlong Fu, Alexander Hauptmann
Homepage	https://arxiv.org/abs/1806.08854