Rethinking Multi-Modal Alignment in Multi-Choice VideoQA from Feature and Sam...
Reasoning about causal and temporal event relations in videos is an emerging goal of Video Question Answering (VideoQA). The major stumbling block to achieving this goal is...