2 datasets found

Tags: image-text matching

  • RefCOCO+ and RefCOCOg

    The RefCOCO+ and RefCOCOg datasets are benchmarks for referring expression comprehension. Each pairs images with natural language expressions that describe specific objects in those images.
  • RefCOCOg

    The RefCOCOg dataset is derived from the MS-COCO dataset and contains 85,474 referring expressions for 54,822 objects in 26,711 images.