CLEVR-CoGenT

CLEVR-CoGenT is a visual question answering dataset in which the questions consist of comparing the positions of two objects.

BibTeX: