MixGen: A New Multi-Modal Data Augmentation

MixGen is a joint data augmentation for vision-language representation learning. It augments images and their paired text together, with the goal of further improving data efficiency.
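
The description above does not spell out the mechanics. As a rough illustration, the sketch below assumes the scheme described in the MixGen paper: a new image-text pair is formed by linearly interpolating two images and concatenating their two captions. The function name `mixgen`, the parameters `num_mix` and `lam`, and the pairing of sample `i` with sample `i + num_mix` are illustrative choices for this sketch, not a definitive implementation.

```python
def mixgen(images, texts, num_mix, lam=0.5):
    """Return copies of a batch in which the first `num_mix` pairs are mixed.

    images: list of objects that support multiplication by a float and
            addition (for example NumPy arrays or tensors of equal shape).
    texts:  list of caption strings, one per image.
    lam:    interpolation weight for the first image of each mixed pair
            (assumed default for this sketch).
    """
    if len(images) != len(texts):
        raise ValueError("images and texts must have the same length")
    if num_mix < 0 or 2 * num_mix > len(images):
        raise ValueError("need at least 2 * num_mix samples in the batch")

    # Work on shallow copies so the caller's lists are left untouched.
    images = list(images)
    texts = list(texts)

    for i in range(num_mix):
        j = i + num_mix  # partner sample for pair i
        # Image side: linear interpolation of the two images.
        images[i] = lam * images[i] + (1.0 - lam) * images[j]
        # Text side: concatenate the two captions.
        texts[i] = texts[i] + " " + texts[j]

    return images, texts
```

In a training loop, a function like this would typically be called on each batch before the images and captions are passed to the model, so that the mixed image and the concatenated caption stay aligned as one training pair.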

BibTeX: