Cross-modal Consensus Network for Weakly Supervised Temporal Action Localization

doi:10.57702/q8zmgrmi

Temporal action localization is the task of localizing the start and end timestamps of action instances in a video and recognizing their categories. In recent years, many works have focused on the fully supervised setting and achieved strong results. However, fully supervised methods require extensive manual frame- or snippet-level annotations. To address this problem, many weakly supervised temporal action localization (WS-TAL) methods have been proposed to detect action instances in a given video using only video-level supervision, which is easier for annotators to provide.

Data and Resources

Original Metadata (JSON)
The JSON representation of the dataset and its distributions, based on DCAT.

Cite this as

Fa-Ting Hong, Jia-Chang Feng, Dan Xu, Ying Shan, Wei-Shi Zheng (2024). Dataset: Cross-modal Consensus Network for Weakly Supervised Temporal Action Localization. https://doi.org/10.57702/q8zmgrmi

DOI retrieved: December 16, 2024

Additional Info

Field	Value
Created	December 16, 2024
Last update	December 16, 2024
Defined In	https://doi.org/10.48550/arXiv.2107.12589
Author	Fa-Ting Hong
More Authors	Jia-Chang Feng, Dan Xu, Ying Shan, Wei-Shi Zheng
Homepage	https://doi.org/10.1145/3474085.3475298