Visual Question Answering (VQA)

doi:10.57702/qco2vve1

The VQA dataset consists of 248,349 training questions, 121,512 validation questions, and 244,302 test questions, posed about a total of 123,287 images.
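
As a quick sanity check on the split sizes quoted above, the following minimal sketch totals the question counts and reports each split's share. The dictionary and function names are illustrative assumptions and are not part of any official VQA tooling; only the three counts come from this record.

```python
# Question counts per split, copied from the dataset description.
# The names below (SPLIT_QUESTIONS, split_shares) are hypothetical.
SPLIT_QUESTIONS = {
    "train": 248_349,
    "val": 121_512,
    "test": 244_302,
}


def split_shares(counts):
    """Return the total count and each split's fraction of that total."""
    total = sum(counts.values())
    shares = {name: n / total for name, n in counts.items()}
    return total, shares


total, shares = split_shares(SPLIT_QUESTIONS)
print(f"total questions: {total:,}")
for name, share in shares.items():
    print(f"{name:>5}: {SPLIT_QUESTIONS[name]:>9,} ({share:.1%})")
```

The three splits add up to 614,163 questions, with the training and test splits being of similar size and the validation split roughly half as large.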

BibTeX:

@dataset{Antol_et_al_and_Zhou_et_al_and_Noh_et_al_and_Ren_et_al_and_Li_et_al_and_Kim_et_al_and_Xu_et_al_and_Yang_et_al_and_Ye_et_al_and_Ben-Younes_et_al_and_Ilievski_et_al_and_Wu_et_al_and_Xie_et_al_and_Kiros_et_al_and_He_et_al_and_Shih_et_al_and_Wang_et_al_and_Zitnick_et_al_2025,
    abstract = {The VQA dataset consists of 248,349 training questions, 121,512 validation questions and 244,302 testing questions, generated on a total of 123,287 images.},
    author = {Antol et al. and Zhou et al. and Noh et al. and Ren et al. and Li et al. and Kim et al. and Xu et al. and Yang et al. and Ye et al. and Ben-Younes et al. and Ilievski et al. and Wu et al. and Xie et al. and Kiros et al. and He et al. and Shih et al. and Wang et al. and Zitnick et al.},
    doi = {10.57702/qco2vve1},
    institution = {No Organization},
    keyword = {Image, Question Answering, Text},
    month = {jan},
    publisher = {TIB},
    title = {Visual Question Answering (VQA)},
    url = {https://service.tib.eu/ldmservice/dataset/visual-question-answering--vqa-},
    year = {2025}
}