VQA

The VQA dataset is a large-scale visual question answering dataset consisting of images paired with open-ended natural language questions that require natural language answers.

Data and Resources

Cite this as

Yash Goyal, Tejas Khot, Douglas Summers-Stay, Dhruv Batra, Devi Parikh (2024). Dataset: VQA. https://doi.org/10.57702/fjcazpne

DOI retrieved: December 16, 2024

Additional Info

Field Value
Created: December 16, 2024
Last update: December 16, 2024
Defined in: https://doi.org/10.48550/arXiv.1906.03952
Citation:
  • https://doi.org/10.48550/arXiv.1806.00857
  • https://doi.org/10.48550/arXiv.2001.03615
  • https://doi.org/10.48550/arXiv.2007.04422
  • https://doi.org/10.1007/s11263-019-01228-7
  • https://doi.org/10.48550/arXiv.2006.06195
  • https://doi.org/10.48550/arXiv.1710.03370
Author: Yash Goyal
More authors: Tejas Khot, Douglas Summers-Stay, Dhruv Batra, Devi Parikh
Homepage: https://vqa.s3-website-us-east-1.amazonaws.com/