Cite this as

Yunhao Gou, Tom Ko, Hansi Yang, Mingxuan Wang, James Kwok, Yu Zhang (2024). Dataset: EPIC: Leveraging Per Image-Token Consistency for Vision-Language Pre-training. Resource: Original Metadata. https://doi.org/10.57702/4s3ecd7l

DOI retrieved: December 2, 2024

Additional Information

Field: Value
Created: December 2, 2024
Last updated: December 2, 2024
Format: JSON