Audio-Visual Question Answering

doi:10.57702/266xhsry

Audio-visual question answering (AVQA) requires a model to draw on both the visual content and the audio of a video, then relate that evidence to the question to predict the most accurate answer.

Data and Resources

Original Metadata (JSON)
The json representation of the dataset with its distributions based on DCAT.

Cite this as

Qilang Ye, Zitong Yu, Xin Liu (2024). Dataset: Audio-Visual Question Answering. https://doi.org/10.57702/266xhsry

DOI retrieved: December 2, 2024

Additional Info

Field	Value
Created	December 2, 2024
Last update	December 2, 2024
Author	Qilang Ye
More Authors	Zitong Yu, Xin Liu
Homepage	https://github.com/rikeilong/MCD-forAVQA