Audio-Visual Question Answering

Audio-visual question answering (AVQA) requires a model to draw on both the visual content and the audio of a video, and to relate them to the question in order to predict the most accurate answer.

Cite this as

Qilang Ye, Zitong Yu, Xin Liu (2024). Dataset: Audio-Visual Question Answering. https://doi.org/10.57702/266xhsry

DOI retrieved: December 2, 2024

Additional Info

Field         Value
Created       December 2, 2024
Last update   December 2, 2024
Author        Qilang Ye
More Authors  Zitong Yu, Xin Liu
Homepage      https://github.com/rikeilong/MCD-forAVQA