i-CLEVR

i-CLEVR is a synthetic dataset generated using the CLEVR engine. Each scene contains five image-instruction pairs. Starting from a background image, new objects are added to the scene one at a time.
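
As a rough illustration of the structure described above, a scene could be modeled in Python as a background image plus an ordered list of five instruction-image steps. This is only a sketch: the class names, field names, and file paths below are hypothetical and do not reflect the dataset's actual file layout or any official loader.

```python
from dataclasses import dataclass, field
from typing import List

# From the description: each scene has five image-instruction pairs.
PAIRS_PER_SCENE = 5


@dataclass
class Step:
    instruction: str  # text of the instruction for this step
    image_path: str   # hypothetical path to the image for this step


@dataclass
class Scene:
    background_path: str  # hypothetical path to the starting background image
    steps: List[Step] = field(default_factory=list)

    def add_step(self, instruction: str, image_path: str) -> None:
        """Append the next image-instruction pair, enforcing the five-pair limit."""
        if len(self.steps) >= PAIRS_PER_SCENE:
            raise ValueError("a scene holds at most five image-instruction pairs")
        self.steps.append(Step(instruction, image_path))

    def history(self, turn: int) -> List[str]:
        """Return the instructions up to and including a 0-based turn."""
        return [s.instruction for s in self.steps[: turn + 1]]


# Build one placeholder scene; the strings are dummies, not real dataset content.
scene = Scene(background_path="scene_0000/background.png")
for i in range(PAIRS_PER_SCENE):
    scene.add_step(f"instruction {i}", f"scene_0000/step_{i}.png")
```

Keeping the steps in an ordered list mirrors the sequential nature of the scenes: the image at a given turn depends on every instruction before it, which `history` makes easy to retrieve.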

BibTeX: