Dataset - LDM

QMUL-Chair-V2

Fine-grained sketch-based image retrieval (FG-SBIR) aims to minimize the distance between sketches and corresponding images in the embedding space.
- Dataset
- JSON
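
The FG-SBIR objective described above can be illustrated with a small sketch. The entry does not name a loss function; the triplet loss below is one common way to pull a sketch toward its matching image in embedding space, and it is an assumption here. The function names, the margin value, and the toy 2-D embeddings are all hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(sketch, positive, negative, margin=0.2):
    """Hinge-style triplet loss (assumed, not taken from the entry).

    The loss reaches zero once the matching image is closer to the
    sketch than the non-matching image by at least `margin`.
    """
    return max(0.0, euclidean(sketch, positive) - euclidean(sketch, negative) + margin)

def rank_gallery(sketch, gallery):
    """Return gallery keys sorted from nearest to farthest from the sketch."""
    return sorted(gallery, key=lambda name: euclidean(sketch, gallery[name]))

# Toy 2-D embeddings, made up for illustration only.
sketch = [0.0, 1.0]
gallery = {"chair_a": [0.1, 0.9], "chair_b": [1.0, 0.0]}

loss = triplet_loss(sketch, gallery["chair_a"], gallery["chair_b"])
ranking = rank_gallery(sketch, gallery)
```

At retrieval time only `rank_gallery` is needed: the gallery images are sorted by distance to the query sketch, and the matching image should appear first.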
Clothes-V1

Fine-grained sketch-based image retrieval (FG-SBIR) aims to minimize the distance between sketches and corresponding images in the embedding space.
- Dataset
- JSON
Products-10K

The Products-10K dataset is a large-scale image retrieval dataset, containing images of products from an e-commerce website.
- Dataset
- JSON
Google Landmarks 2020 Dataset

The Google Landmarks 2020 Dataset is a large-scale image retrieval dataset, containing images of landmarks from around the world.
- Dataset
- JSON
GUIE Challenge

The Google Universal Image Embedding (GUIE) Challenge dataset is a large-scale image retrieval dataset, covering a wide distribution of objects: landmarks, artwork, food, etc.
- Dataset
- JSON
Shoes

The Shoes dataset consists of approximately 50,000 RGB images of shoes, drawn from 4 categories and over 3,000 subcategories.
- Dataset
- JSON
Visual Concept Search

A dataset for visual concept search, where the goal is to identify images containing relevant content based on visual concepts.
- Dataset
- JSON
COCO 5K

COCO 5K is the dataset used in the paper on unpaired vision-language pre-training via cross-modal CutMix.
- Dataset
- JSON
Cambridge Landmarks

The Cambridge Landmarks dataset contains 5 different large outdoor scenes of landmarks in the city of Cambridge.
- Dataset
- JSON
ZeroSearch Dataset

A custom dataset to simulate a user's image directory for testing the ZeroSearch algorithm.
- Dataset
- JSON
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval

Frozen in time: A joint video and image encoder for end-to-end retrieval.
- Dataset
- JSON
Birds-to-Words

The Birds-to-Words dataset contains 15,931 images (12,770 training and 3,151 testing) tagged with descriptions of fine-grained differences between pairwise bird images.
- Dataset
- JSON
CIRR

CIRR is a general image dataset comprising 36,554 triplets derived from 21,552 images taken from NLVR2, a popular natural language reasoning dataset.
- Dataset
- JSON
FashionIQ

The FashionIQ dataset contains images of fashion products in 3 categories (Dress, Toptee, and Shirt), with 46,609 images in the training set and 31,075 images in the validation set.
- Dataset
- JSON
NUS-WIDE

NUS-WIDE, as used in the paper, is a multi-view clustering dataset containing 6 views of 30,000 samples each. The dataset is used to evaluate the performance of the proposed...
- Dataset
- JSON
Flickr30k

The Flickr30k dataset is widely utilized for image caption and image-text retrieval tasks, providing a substantial collection of images with associated captions.
- Dataset
- JSON
Oxford5k and Paris6k

Oxford5k and Paris6k are large-scale image retrieval datasets.
- Dataset
- JSON
CUB200-2011

CUB200-2011 is a fine-grained image classification dataset.
- Dataset
- JSON
LabelMe dataset

The LabelMe dataset is a natural scene dataset used for testing the performance of the IBTM model on image classification tasks.
- Dataset
- JSON
YFCC100M

YFCC100M is a large-scale video dataset, used in the paper for foreground and background patch extraction and for object recognition tasks.
- Dataset
- JSON

You can also access this registry using the API (see API Docs).
