Multi30k

doi:10.57702/53x0lebb

The Multi30k dataset is an extension of the Flickr30k dataset, containing 29,000 training images, 1,014 validation images, and 1,000 test images. Each image is accompanied by six captions in English, German, French and Czech.
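As a minimal sketch, the split sizes stated above can be recorded as constants and used to sanity-check a local copy of the data. The names `SPLIT_SIZES`, `LANGUAGES`, `split_fractions`, and `check_split` are hypothetical helpers introduced here for illustration, and the two-letter language codes are an assumption; none of them come from the dataset's own tooling.

```python
# Split sizes as stated in the dataset description.
SPLIT_SIZES = {"train": 29_000, "validation": 1_014, "test": 1_000}

# Caption languages named in the description; the two-letter codes
# are an assumption made for this sketch.
LANGUAGES = ("en", "de", "fr", "cs")


def split_fractions(sizes):
    """Return each split's share of the total number of images."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


def check_split(name, image_ids):
    """Raise ValueError if a loaded split does not match the stated size."""
    expected = SPLIT_SIZES[name]
    actual = len(image_ids)
    if actual != expected:
        raise ValueError(
            f"{name}: expected {expected} images, found {actual}"
        )


if __name__ == "__main__":
    for name, fraction in split_fractions(SPLIT_SIZES).items():
        print(f"{name}: {SPLIT_SIZES[name]} images ({fraction:.1%})")
```

Calling `check_split("test", ids)` on a list of image identifiers loaded from a local copy would flag a truncated or duplicated split before training starts.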

Data and Resources

Original Metadata (JSON)
The JSON representation of the dataset with its distributions based on DCAT.

Cite this as

Joji Toyama, Masanori Misono, Masahiro Suzuki, Kotaro Nakayama, Yutaka Matsuo (2024). Dataset: Multi30k. https://doi.org/10.57702/53x0lebb

DOI retrieved: November 25, 2024

Additional Info

Field	Value
Created	November 25, 2024
Last update	December 3, 2024
Defined In	https://doi.org/10.48550/arXiv.1809.00151
Citation	https://doi.org/10.48550/arXiv.1611.08459 https://doi.org/10.48550/arXiv.1811.04697
Author	Joji Toyama
More Authors	Masanori Misono Masahiro Suzuki Kotaro Nakayama Yutaka Matsuo
Homepage	https://github.com/dl4mt/dl4mt-tutorial/blob/master/docs/mult30k.md