-
nuScenes Scene Flow
Self-driving dataset for scene flow estimation -
Argoverse Scene Flow
Self-driving dataset for scene flow estimation -
KITTI Scene Flow
Self-driving dataset for scene flow estimation -
3D-R2N2: A Unified Approach for Single and Multi-View 3D Object Reconstruction
3D-R2N2: A Unified Approach for Single and Multi-View 3D Object Reconstruction -
V2R Mirrors or Windows
The dataset is used to demonstrate a new interaction paradigm in virtual reality environments, which consists of a virtual mirror or window projected onto a virtual surface. -
MNIST and notMNIST datasets
The MNIST dataset is used for in-distribution confusing examples, and the notMNIST dataset is used for out-of-distribution data. -
CAMELYON-16
The CAMELYON-16 dataset is a public dataset for whole slide image analysis, containing about 400 whole slide images of sentinel lymph node sections from breast cancer patients. -
JAFFE Database
The JAFFE database consists of 213 images from 10 Japanese female subjects. For each subject, there are three or four images for each of the seven expressions (six basic expressions plus neutral). -
Perspective Transformer Nets
Dataset used in the Perspective Transformer Nets paper for 3D mesh reconstruction from a single image. -
Unit Sphere
The dataset used in the paper is a collection of data points from the unit sphere. -
CIFAR-10, CIFAR-100, and TinyImageNet
The datasets used in the paper are CIFAR-10, CIFAR-100, and TinyImageNet. -
ScanNet200
Diff2Scene uses ScanNet, Matterport3D, ScanNet200 and Replica for open-vocabulary 3D semantic segmentation and visual grounding tasks. -
Depth Map Super-Resolution
Depth map super-resolution (DMSR) is a practical and valuable computer vision task. DMSR requires upscaling a low-resolution (LR) depth map into a high-resolution (HR) space. -
GEN-VLKT: Simplify Association and Enhance Interaction Understanding for HOI ...
Human-Object Interaction (HOI) detection is a significant task to make a machine understand human activities in a static image at a fine-grained level. -
CIFAR10, CIFAR100, SVHN, ImageNet
The paper does not describe its data in detail, but states that the authors used four widely used datasets: CIFAR10, CIFAR100, SVHN, and ImageNet. -
Perspective Crop Layers (PCLs)
Local processing is an essential feature of CNNs and other neural network architectures—it is one of the reasons why they work so well on images where relevant information is,... -
PASCAL VOC Dataset
The PASCAL VOC dataset contains 20 object classes in four groups (person, animal, vehicle, and indoor), with 9,963 images containing 24,640 annotated objects. -
Microsoft Bing Images API
Aerial images of London.