-
PPR10K: A Large-Scale Portrait Photo Retouching Dataset
PPR10K: A large-scale portrait photo retouching dataset with human-region mask and group-level consistency. -
PIE: A Large-Scale Dataset and Models for Pedestrian Intention Estimation and...
A large-scale dataset and models for pedestrian intention estimation and trajectory prediction. -
AVA Database
The AVA database contains over 250,000 images with multiple aesthetic scores from reviewers. -
YouTube-8M: A Large-Scale Video Classification Benchmark
YouTube-8M is a large-scale video classification benchmark. -
Oxford RobotCar
The Oxford RobotCar dataset contains a large amount of data collected along a single route through central Oxford, covering a range of weather and traffic conditions. -
Cash-out fraud detection dataset
The dataset used in this paper is a large-scale dataset for automatic detection of cash-out fraud, with more than 100 million training samples. -
Open Image Dataset
The Open Image Dataset (OID) is a large-scale image dataset that contains a diverse set of images. -
Image Text Pseudo-Pose (ITPP)
A large-scale dataset of human poses and their text descriptions extracted from large-scale image datasets. -
LRS3-TED: A Large-Scale Dataset for Visual Speech Recognition
LRS3-TED: a large-scale dataset for visual speech recognition. -
How2: A large-scale dataset for multimodal language understanding
How2 is a large-scale multimodal machine translation dataset whose mean sentence length is 1.57 times longer than that of Multi30k and which contains no repetition. -
YFCC100M dataset
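The YFCC100M (Yahoo Flickr Creative Commons 100 Million) dataset contains about 100 million Flickr photos and videos released under Creative Commons licenses. -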
ImageNet-2012 training set
The ImageNet-2012 training set of 1.2 million images labelled into 1,000 object categories. -
LAION-Aesthetics
LAION-Aesthetics is a collection of subsets of the LAION-5B image-text dataset, selected by a predicted aesthetic score. -
MIT-Adobe FiveK
The MIT-Adobe FiveK dataset contains 5,000 photographs, each retouched by five trained photographers, and is widely used for photo enhancement and retouching. -
Multimodal C4 (mmc4)
Multimodal C4 (mmc4) is a public, billion-scale corpus of images and text, constructed from public webpages contained in the cleaned English c4 corpus. -
DialogCC: Large-Scale Multi-Modal Dialogue Dataset
A large-scale multi-modal dialogue dataset created with an automatic pipeline that filters candidates using CLIP similarity.