COCO 5K

Dataset used in the VLMixer paper on unpaired vision-language pre-training via cross-modal CutMix.

Cite this as

Teng Wang, Wenhao Jiang, Zhichao Lu, Feng Zheng, Ran Cheng, Chengguo Yin, Ping Luo (2024). Dataset: COCO 5K. https://doi.org/10.57702/b0ojmz96

DOI retrieved: December 2, 2024

Additional Info

Field          Value
Created        December 2, 2024
Last update    December 2, 2024
Defined In     https://doi.org/10.48550/arXiv.2206.08919
Author         Teng Wang
More Authors   Wenhao Jiang, Zhichao Lu, Feng Zheng, Ran Cheng, Chengguo Yin, Ping Luo
Homepage       https://github.com/ttengwang/VLMixer