Semantic Image-Text-Classes

This dataset is introduced by the paper "Understanding, Categorizing and Predicting Semantic Image-Text Relations".

If you are using this dataset it in your work, please cite:

@inproceedings{otto2019understanding, title={Understanding, Categorizing and Predicting Semantic Image-Text Relations}, author={Otto, Christian and Springstein, Matthias and Anand, Avishek and Ewerth, Ralph}, booktitle={In Proceedings of ACM International Conference on Multimedia Retrieval (ICMR 2019)}, year={2019} }

To create the full tar use the following command in the command line:

cat train.tar.part* > train_concat.tar

Then simply untar it via

tar -xf train_concat.tar

The jsonl files contain metadata of the following format:

id, origin, CMI, SC, STAT, ITClass, text, tagged text, image_path

License Information:

This dataset is composed of various open access sources as described in the paper. We thank all the original authors for their work.

BibTex: