Vision-and-Language Navigation

doi:10.57702/ntvoct4a

In the Vision-and-Language Navigation (VLN) task, the agent is given a global natural-language instruction I = {w0, ..., wl}, where each wi is a word token and l is the length of the sentence.
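
As a rough illustration of the notation above, the sketch below splits an instruction into the word tokens w0, w1, and so on. The example sentence, the function name tokenize_instruction, and the lowercase alphanumeric tokenization rule are all assumptions made for illustration; the dataset record does not specify a tokenizer.

```python
import re


def tokenize_instruction(instruction: str) -> list[str]:
    """Split an instruction into lowercase word tokens.

    Hypothetical scheme: keeps runs of letters and digits and drops
    punctuation. The dataset itself may use a different tokenizer.
    """
    return re.findall(r"[a-z0-9]+", instruction.lower())


# Made-up instruction, not taken from the dataset.
instruction = "Walk past the sofa and stop at the kitchen door."
tokens = tokenize_instruction(instruction)

# Print each token with its index, mirroring the w_i notation.
for i, w in enumerate(tokens):
    print(f"w{i} = {w}")
```

Each element of tokens plays the role of one wi in the instruction I.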

Data and Resources

Original Metadata (JSON)
The JSON representation of the dataset with its distributions based on DCAT.

Cite this as

Peter Anderson, Angel X. Chang, Devendra Singh Chaplot, Alexey Dosovitskiy, Saurabh Gupta, Jitendra Malik, Vladlen Koltun, Manolis Savva, Amir Roshan Zamir (2024). Dataset: Vision-and-Language Navigation. https://doi.org/10.57702/ntvoct4a

DOI retrieved: December 3, 2024

Additional Info

Field	Value
Created	December 3, 2024
Last update	December 3, 2024
Defined In	https://doi.org/10.48550/arXiv.2308.03244
Citation	https://doi.org/10.48550/arXiv.2203.04006 https://doi.org/10.48550/arXiv.2210.10020
Author	Peter Anderson
More Authors	Angel X. Chang, Devendra Singh Chaplot, Alexey Dosovitskiy, Saurabh Gupta, Jitendra Malik, Vladlen Koltun, Manolis Savva, Amir Roshan Zamir
Homepage	https://arxiv.org/abs/1807.06757