VILLA

The dataset used in the VILLA paper, "Large-Scale Adversarial Training for Vision-and-Language Representation Learning" (https://doi.org/10.48550/arXiv.2006.06195).

Data and Resources

Cite this as

Zhe Gan, Yen-Chun Chen, Linjie Li, Chen Zhu, Yu Cheng, Jingjing Liu (2024). Dataset: VILLA. https://doi.org/10.57702/1i9ozblf

DOI retrieved: December 17, 2024

Additional Info

Field: Value
Created: December 17, 2024
Last update: December 17, 2024
Defined In: https://doi.org/10.48550/arXiv.2006.06195
Author: Zhe Gan
More Authors: Yen-Chun Chen, Linjie Li, Chen Zhu, Yu Cheng, Jingjing Liu
Homepage: https://github.com/zhegan27/VILLA