Stacked Cross Attention

This is the dataset used in the paper "Stacked Cross Attention for Image-Text Matching".

BibTeX: