-
AVA Database
The AVA database contains over 250,000 images with multiple aesthetic scores from reviewers. -
YouTube-8M: A Large-Scale Video Classification Benchmark
YouTube-8M is a large-scale video classification benchmark of roughly 8 million YouTube videos labeled with a vocabulary of visual entities. -
Oxford RobotCar
The Oxford RobotCar dataset contains a large amount of data collected by repeatedly driving one route through central Oxford, covering a range of weather and traffic conditions. -
Cash-out fraud detection dataset
The dataset used in this paper is a large-scale dataset for automatic detection of cash-out fraud, with more than 100 million training samples. -
Open Images Dataset
The Open Images Dataset (OID) is a large-scale image dataset of roughly 9 million diverse images annotated with image-level labels and object bounding boxes. -
Image Text Pseudo-Pose (ITPP)
A large-scale dataset of human poses and their text descriptions extracted from large-scale image datasets. -
LRS3-TED: A Large-Scale Dataset for Visual Speech Recognition
LRS3-TED is a large-scale dataset for visual speech recognition, built from TED and TEDx talk videos. -
How2: A large-scale dataset for multimodal language understanding
How2 is a large-scale multimodal machine translation dataset whose mean sentence length is 1.57 times that of Multi30k, with no repetition. -
YFCC100M dataset
The YFCC100M dataset contains about 100 million Flickr photos and videos released under Creative Commons licenses. -
ImageNet-2012 training set
The ImageNet-2012 training set of 1.2 million images labelled into 1,000 object categories. -
LAION-Aesthetics
The LAION-Aesthetics dataset is a large-scale subset of LAION-5B image-text pairs selected by predicted aesthetic score. -
MIT-Adobe FiveK
The MIT-Adobe FiveK dataset contains 5,000 raw photographs, each retouched by five expert photographers, and is used for image enhancement and photo retouching. -
Multimodal C4 (mmc4)
Multimodal C4 (mmc4) is a public, billion-scale corpus of images and text, constructed from public webpages contained in the cleaned English c4 corpus. -
DialogCC: Large-Scale Multi-Modal Dialogue Dataset
A large-scale multi-modal dialogue dataset created with an automatic pipeline that filters image-dialogue pairs by CLIP similarity. -
Open Images Dataset
The dataset used in the experiment consists of 50 images equally distributed between five classes: aircraft, bird, bicycle, boat, and dog. Each class has 5 true positive images... -
LAION-400M dataset
The LAION-400M dataset is a collection of about 400 million image-caption pairs gathered from web crawl data and filtered with CLIP.