- Image-Text Retrieval
The dataset used in the paper for image-text retrieval.
- Alimama retrieval dataset
The Alimama retrieval dataset is a large-scale dataset covering daily search logs from three scenarios: Visual Search (VS), Similar Search (SS), and Interest Search (IS) on...
- MSR-VTT-CN
A bilingual video-text retrieval dataset.
- Cross-Lingual Cross-Modal Retrieval with Noise-Robust Learning
Cross-lingual cross-modal retrieval with noise-robust learning for low-resource languages.
- Stickers Dataset
The image-only stickers dataset used for testing the kNN-Diffusion model.
- Public Multimodal Dataset
The dataset used for training the kNN-Diffusion model, which relies on large-scale retrieval to train a text-to-image model without any text data.
- DeepFashion dataset
The DeepFashion dataset is a large-scale dataset for person image synthesis, containing 101,966 pairs of images with different poses and clothing.
- ActivityNet Captions
ActivityNet Captions is a benchmark dataset proposed for dense video captioning. It contains 20K untrimmed videos in total, and each video has several annotated segments with...