RefCOCO, RefCOCO+, and RefCOCOg

Visual grounding is the task of locating a target object in an image according to a natural language expression. The datasets used in this paper are RefCOCO, RefCOCO+, and RefCOCOg.
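
To make the task definition concrete, the following is a minimal sketch of what a grounding sample and a correctness check might look like. The record layout (`GroundingSample`), the `(x, y, width, height)` box format, the example expression, and the 0.5 overlap threshold are all assumptions for illustration; none of them are specified by this record.

```python
from dataclasses import dataclass
from typing import Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x, y, width, height)


@dataclass
class GroundingSample:
    """Hypothetical record: one expression paired with one target box."""
    image_id: int
    expression: str
    box: Box


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two (x, y, width, height) boxes."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    # Overlap extent along each axis, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    inter_h = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def is_correct(predicted: Box, sample: GroundingSample, threshold: float = 0.5) -> bool:
    """Count a prediction as correct when it overlaps the target enough.

    The 0.5 default is an assumed threshold, not taken from this record.
    """
    return iou(predicted, sample.box) >= threshold


sample = GroundingSample(
    image_id=1,
    expression="the man on the left",  # made-up expression
    box=(10.0, 10.0, 100.0, 100.0),
)
print(is_correct((12.0, 8.0, 100.0, 100.0), sample))
```

A model for this task would take the image and `expression` as input and produce the predicted box; the check above only compares that output with the annotated target.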

Cite this as

Yucheng Suo, Linchao Zhu, Yi Yang (2024). Dataset: RefCOCO, RefCOCO+, and RefCOCOg. https://doi.org/10.57702/a6ushrmc

DOI retrieved: December 2, 2024

Additional Info

Created: December 2, 2024
Last update: December 2, 2024
Defined In: https://doi.org/10.48550/arXiv.2310.18049
Citation: https://doi.org/10.48550/arXiv.2206.09114
Author: Yucheng Suo
More Authors: Linchao Zhu, Yi Yang
Homepage: https://arxiv.org/abs/2304.02643