Dataset - LDM

PPR10K: A Large-Scale Portrait Photo Retouching Dataset

PPR10K: A large-scale portrait photo retouching dataset with human-region mask and group-level consistency.
- Dataset
- JSON
PIE: A Large-Scale Dataset and Models for Pedestrian Intention Estimation and...

A large-scale dataset and models for pedestrian intention estimation and trajectory prediction.
- Dataset
- JSON
AVA Database

The AVA database contains over 250,000 images with multiple aesthetic scores from reviewers.
- Dataset
- JSON
YouTube-8M: A Large-Scale Video Classification Benchmark

YouTube-8M is a large-scale video classification benchmark.
- Dataset
- JSON
Oxford RobotCar

The Oxford RobotCar dataset contains a large amount of data collected along a single route through central Oxford, covering a range of weather and traffic conditions.
- Dataset
- JSON
Cash-out fraud detection dataset

A large-scale dataset for automatic detection of cash-out fraud, with more than 100 million training samples.
- Dataset
- JSON
Open Image Dataset

The Open Image Dataset (OID) is a large-scale image dataset that contains a diverse set of images.
- Dataset
- JSON
TACRED

TACRED is a large-scale relation extraction dataset. The paper uses it, along with a few-shot variant of TACRED, for few-shot relation extraction.
- Dataset
- JSON
FFIW10K

A novel large-scale dataset for face forgery detection in multi-person scenarios, comprising 10,000 high-quality forgery videos with an average of three human faces in each frame.
- Dataset
- JSON
N-UCLA

The N-UCLA dataset is a widely used skeleton-based action recognition dataset containing 1,494 video clips featuring 10 volunteers.
- Dataset
- JSON
Image Text Pseudo-Pose (ITPP)

A large-scale dataset of human poses and their text descriptions extracted from large-scale image datasets.
- Dataset
- JSON
SoundNet

The dataset is used for learning general and effective models for both audio and video analysis from self-supervised temporal synchronization.
- Dataset
- JSON
LRS3-TED: A Large-Scale Dataset for Visual Speech Recognition

LRS3-TED: a large-scale dataset for visual speech recognition.
- Dataset
- JSON
How2: A large-scale dataset for multimodal language understanding

How2 is a large-scale multimodal machine translation dataset with a mean sentence length 1.57 times longer than that of Multi30k and no repetition.
- Dataset
- JSON
YFCC100M dataset

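YFCC100M (Yahoo Flickr Creative Commons 100 Million) is a collection of roughly 100 million Flickr photos and videos released under Creative Commons licenses.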
- Dataset
- JSON
ImageNet-2012 training set

The ImageNet-2012 training set of 1.2 million images labelled into 1,000 object categories.
- Dataset
- JSON
LAION-Aesthetics

LAION-Aesthetics is a collection of subsets of the LAION-5B image-text dataset, filtered by predicted aesthetic score and used for training and evaluating computer vision models.
- Dataset
- JSON
MIT-Adobe FiveK

The MIT-Adobe FiveK dataset contains 5,000 raw photographs, each retouched by five expert photographers, and is widely used for photo enhancement and retouching.
- Dataset
- JSON
Multimodal C4 (mmc4)

Multimodal C4 (mmc4) is a public, billion-scale corpus of images and text, constructed from public webpages contained in the cleaned English c4 corpus.
- Dataset
- JSON
DialogCC: Large-Scale Multi-Modal Dialogue Dataset

A large-scale multi-modal dialogue dataset created with an automatic pipeline that filters candidates by CLIP similarity.
- Dataset
- JSON

You can also access this registry using the API (see API Docs).

76 datasets found