Vision-and-Language Navigation

The Vision-and-Language Navigation (VLN) task provides a natural-language sentence I = {w0, ..., wl} as a global instruction, where each wi is a word token and l is the length of the sentence.
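
As a rough illustration of this representation, the sketch below turns an instruction string into an ordered list of word tokens, so that tokens[i] plays the role of wi. The helper name tokenize_instruction, the lowercasing and punctuation-stripping rule, and the example sentence are all assumptions made for illustration; the source does not specify a tokenizer, and actual VLN pipelines may use a different one.

```python
import re


def tokenize_instruction(instruction: str) -> list[str]:
    """Hypothetical helper: split an instruction into lowercase word tokens.

    Assumption: tokens are maximal runs of ASCII letters or digits, so
    punctuation is dropped. This is only one possible tokenization.
    """
    return re.findall(r"[a-z0-9]+", instruction.lower())


# Illustrative instruction (not taken from any dataset).
tokens = tokenize_instruction("Walk past the sofa, then stop at the door.")

# tokens[i] corresponds to the word token w_i in the instruction I.
for i, w in enumerate(tokens):
    print(f"w{i} = {w}")
```

The same instruction is visible to the agent at every step of an episode, which is why the sentence is described as global.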

BibTeX: