14 datasets found

Tags: retrieval

  • Image-Text Retrieval

    The dataset used in the paper for image-text retrieval.
  • Alimama retrieval dataset

    The Alimama retrieval dataset is a large-scale dataset covering daily search logs of the three scenarios: Visual Search (VS), Similar Search (SS), and Interest Search (IS) on...
  • Robust04

    Robust04 is a news corpus containing 0.5M documents and 249 queries.
  • MSR-VTT-CN

    A bilingual video-text retrieval dataset.
  • Cross-Lingual Cross-Modal Retrieval with Noise-Robust Learning

    Cross-lingual cross-modal retrieval with noise-robust learning for low-resource languages
  • Stickers Dataset

    The image-only stickers dataset used for testing the kNN-Diffusion model.
  • Public Multimodal Dataset

    The dataset used for training kNN-Diffusion, a model that uses large-scale retrieval to train a text-to-image model without any text data.
  • AudioCaps

    Audio-text retrieval aims at retrieving a target audio clip or caption from a pool of candidates given a query in another modality.
  • LSMDC

    The LSMDC movie description dataset consists of 118,081 short video clips extracted from 202 movies, each annotated with a single caption.
  • DeepFashion dataset

    The DeepFashion dataset is a large-scale dataset for person image synthesis, containing 101,966 pairs of images with different poses and clothing.
  • MSVD

    Text-Video Retrieval (TVR) aims to align relevant video content with natural language queries. To date, most state-of-the-art TVR methods learn image-to-video transfer learning...
  • ActivityNet Captions

    ActivityNet Captions is a benchmark dataset proposed for dense video captioning. It contains 20K untrimmed videos in total, and each video has several annotated segments with...
  • MSR-VTT

    MSR-VTT is a large video description dataset for bridging video and language. It contains 10k video clips with length varying from 10 to...
  • COCO

    Large-scale datasets [18, 17, 27, 6] have boosted the quality of text-conditional image generation. However, in some domains it can be difficult to build such datasets and usually it could...
You can also access this registry using the API (see API Docs).