Object Attribute Matters in Visual Question Answering

Visual question answering (VQA) is a multimodal task that requires joint comprehension of visual and textual information. The proposed approach uses object attributes to explicitly align visual and linguistic semantics.

Cite this as

Peize Li, Qingyi Si, Peng Fu, Zheng Lin, Yan Wang (2024). Dataset: Object Attribute Matters in Visual Question Answering. https://doi.org/10.57702/asdzwf73

DOI retrieved: December 16, 2024

Additional Info

Created: December 16, 2024
Last update: December 16, 2024
Author: Peize Li
More Authors: Qingyi Si, Peng Fu, Zheng Lin, Yan Wang
Homepage: https://arxiv.org/abs/2104.12345