Cite this as

T. Wang, W. Jiang, Z. Lu, F. Zheng, R. Cheng, C. Yin, P. Luo (2024). Dataset: VLMixer: Unpaired Vision-Language Pre-training via Cross-Modal CutMix. Resource: Original Metadata. https://doi.org/10.57702/nv69pofd

DOI retrieved: December 16, 2024

Additional Information

Field         Value
Created       December 16, 2024
Last updated  December 16, 2024
Format        JSON