AGQA: A Benchmark for Compositional Spatio-Temporal Reasoning

AGQA (Action Genome Question Answering) is a video question answering benchmark comprising 192M question-answer pairs, generated by hand-crafted programs, about 9.6K videos from the Charades dataset.

BibTeX: