Room-to-Room (R2R) dataset

The Room-to-Room (R2R) dataset is a benchmark for vision-and-language navigation tasks. It consists of 7,189 paths sampled from navigation graphs, and each path is paired with three ground-truth navigation instructions written by humans.
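
The path-and-instruction structure described above can be sketched as a small record type. This is only an illustration under stated assumptions: the class name `R2RPathRecord` and the field names `path_id`, `viewpoints`, and `instructions` are hypothetical and are not taken from the dataset's actual file format. Only the two counts (7,189 paths, three instructions per path) come from the description.

```python
from dataclasses import dataclass
from typing import List

NUM_PATHS = 7189            # from the dataset description
INSTRUCTIONS_PER_PATH = 3   # from the dataset description


@dataclass
class R2RPathRecord:
    """Hypothetical record layout; field names are illustrative assumptions."""
    path_id: int
    viewpoints: List[str]    # ordered node ids along a navigation graph
    instructions: List[str]  # human-written instructions for this path

    def __post_init__(self) -> None:
        # Enforce the "three instructions per path" property from the description.
        if len(self.instructions) != INSTRUCTIONS_PER_PATH:
            raise ValueError(
                f"expected {INSTRUCTIONS_PER_PATH} instructions, "
                f"got {len(self.instructions)}"
            )


def total_instructions(num_paths: int = NUM_PATHS) -> int:
    """Number of instructions implied by the path count."""
    return num_paths * INSTRUCTIONS_PER_PATH


if __name__ == "__main__":
    print(total_instructions())  # 7,189 paths x 3 instructions = 21567
```

Under these assumptions, the stated counts imply 21,567 instructions in total.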

Data and Resources

Cite this as

Chih-Yao Ma, Jiasen Lu, Zuxuan Wu, Ghassan AlRegib, Zsolt Kira, Richard Socher, Caiming Xiong (2024). Dataset: Room-to-Room (R2R) dataset. https://doi.org/10.57702/e4wqnskj

DOI retrieved: December 16, 2024

Additional Info

Field          Value
Created        December 16, 2024
Last update    December 16, 2024
Defined In     https://doi.org/10.48550/arXiv.1803.07729
Citation
  • https://doi.org/10.48550/arXiv.2403.03405
  • https://doi.org/10.48550/arXiv.2110.05728
  • https://doi.org/10.48550/arXiv.1901.03035
  • https://doi.org/10.48550/arXiv.1911.07883
  • https://doi.org/10.48550/arXiv.1711.07280
  • https://doi.org/10.48550/arXiv.2107.07201
  • https://doi.org/10.48550/arXiv.2308.03244
Author         Chih-Yao Ma
More Authors   Jiasen Lu, Zuxuan Wu, Ghassan AlRegib, Zsolt Kira, Richard Socher, Caiming Xiong
Homepage       https://arxiv.org/abs/1807.06757