COCO 5K

COCO 5K is the dataset used in the VLMixer paper on unpaired vision-language pre-training via cross-modal CutMix.

Data and Resources

Original Metadata (JSON)
The JSON representation of the dataset and its distributions, based on DCAT.

Teng Wang, Wenhao Jiang, Zhichao Lu, Feng Zheng, Ran Cheng, Chengguo Yin, Ping Luo (2024). Dataset: COCO 5K. https://doi.org/10.57702/b0ojmz96

DOI retrieved: December 2, 2024

Field	Value
Created	December 2, 2024
Last update	December 2, 2024
Defined In	https://doi.org/10.48550/arXiv.2206.08919
Author	Teng Wang
More Authors	Wenhao Jiang, Zhichao Lu, Feng Zheng, Ran Cheng, Chengguo Yin, Ping Luo
Homepage	https://github.com/ttengwang/VLMixer