General Visual Question Answering

The GQA dataset is a large-scale dataset of visual scene graphs. It contains 8,208 images, each annotated with object and predicate labels that together form a scene graph.
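
To make the scene graph idea concrete, the sketch below models one image's annotations as labeled objects (nodes) connected by labeled predicates (edges). It is only an illustration: the class, field names, and example labels are hypothetical and do not reflect the actual GQA annotation schema or file format.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    """Toy scene graph for one image (hypothetical layout, not the GQA schema)."""

    # Node table: object id -> object label, e.g. "o1" -> "man".
    objects: dict[str, str] = field(default_factory=dict)
    # Edge list: (subject id, predicate label, object id).
    relations: list[tuple[str, str, str]] = field(default_factory=list)

    def add_object(self, obj_id: str, label: str) -> None:
        self.objects[obj_id] = label

    def add_relation(self, subj_id: str, predicate: str, obj_id: str) -> None:
        # Both endpoints must already be annotated objects.
        if subj_id not in self.objects or obj_id not in self.objects:
            raise KeyError("relation refers to an unknown object id")
        self.relations.append((subj_id, predicate, obj_id))

    def triples(self) -> list[tuple[str, str, str]]:
        """Return relations with object ids resolved to their labels."""
        return [
            (self.objects[s], p, self.objects[o]) for s, p, o in self.relations
        ]


def build_example() -> SceneGraph:
    """Build a made-up two-object graph; the labels are illustrative only."""
    graph = SceneGraph()
    graph.add_object("o1", "man")
    graph.add_object("o2", "horse")
    graph.add_relation("o1", "riding", "o2")
    return graph


if __name__ == "__main__":
    print(build_example().triples())
```

Keeping objects in a table keyed by id and relations as id triples means a label can be corrected in one place without touching the edges that refer to it.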

BibTeX: