MSRVTT-QA

Video question answering (VideoQA) requires a system to understand the visual content of a video and infer from it the answer to a natural-language question.

BibTeX: