Vision-and-Language Navigation

The Vision-and-Language Navigation (VLN) task provides a global natural-language sentence I = {w0, ..., wl} as the instruction, where each wi is a word token and l is the length of the sentence.
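
As a rough illustration of the instruction format described here, the sketch below splits a sentence into an ordered list of word tokens and reports its length. The example sentence, the function name `tokenize_instruction`, and the regex-based lowercase tokenizer are all assumptions made for illustration; the dataset record does not specify how instructions are tokenized.

```python
import re
from typing import List


def tokenize_instruction(instruction: str) -> List[str]:
    """Split an instruction into lowercase word tokens.

    Hypothetical tokenizer: keeps alphanumeric runs (with an optional
    apostrophe suffix such as "don't") and drops punctuation.
    """
    return re.findall(r"[a-z0-9]+(?:'[a-z]+)?", instruction.lower())


# Made-up instruction, not taken from the dataset.
instruction = "Walk past the sofa and stop at the kitchen door."

tokens = tokenize_instruction(instruction)  # the ordered word tokens w_i
length = len(tokens)                        # number of tokens in the sentence

for i, w in enumerate(tokens):
    print(f"w{i} = {w}")
print("length:", length)
```

A real VLN pipeline would typically map these tokens to vocabulary indices or embeddings before passing them to the navigation model; that step is omitted here.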

Cite this as

Peter Anderson, Angel X. Chang, Devendra Singh Chaplot, Alexey Dosovitskiy, Saurabh Gupta, Jitendra Malik, Vladlen Koltun, Manolis Savva, Amir Roshan Zamir (2024). Dataset: Vision-and-Language Navigation. https://doi.org/10.57702/ntvoct4a

DOI retrieved: December 3, 2024

Additional Info

Created: December 3, 2024
Last update: December 3, 2024
Defined In: https://doi.org/10.48550/arXiv.2308.03244
Citation:
  • https://doi.org/10.48550/arXiv.2203.04006
  • https://doi.org/10.48550/arXiv.2210.10020
Author: Peter Anderson
More Authors: Angel X. Chang, Devendra Singh Chaplot, Alexey Dosovitskiy, Saurabh Gupta, Jitendra Malik, Vladlen Koltun, Manolis Savva, Amir Roshan Zamir
Homepage: https://arxiv.org/abs/1807.06757