Multi30k

The Multi30k dataset is an extension of the Flickr30k dataset, containing 29,000 training images, 1,014 validation images, and 1,000 test images. Each image is accompanied by six captions in English, German, French, and Czech.

Cite this as

Joji Toyama, Masanori Misono, Masahiro Suzuki, Kotaro Nakayama, Yutaka Matsuo (2024). Dataset: Multi30k. https://doi.org/10.57702/53x0lebb

DOI retrieved: November 25, 2024

Additional Info

Created: November 25, 2024
Last update: December 3, 2024
Defined In: https://doi.org/10.48550/arXiv.1809.00151
Citation:
  • https://doi.org/10.48550/arXiv.1611.08459
  • https://doi.org/10.48550/arXiv.1811.04697
Author: Joji Toyama
More Authors: Masanori Misono, Masahiro Suzuki, Kotaro Nakayama, Yutaka Matsuo
Homepage: https://github.com/dl4mt/dl4mt-tutorial/blob/master/docs/mult30k.md