-
Seeing 3D chairs: exemplar part-based 2D-3D alignment using a large dataset o...
Seeing 3D chairs: exemplar part-based 2D-3D alignment using a large dataset of CAD models -
ImageNet-1K, Food-101, Birds, and Dogs datasets
The datasets used for image classification tasks: ImageNet-1K, Food-101, Birds, and Dogs. -
Hierarchical Dynamic Image Harmonization
Image harmonization is a critical task in computer vision, which aims to adjust the foreground to make it compatible with the background. -
Shadow Detection Datasets
This paper uses four widely used shadow detection benchmark datasets: SBU, UCF, ISTD, and CUHK. -
Image Enhancement for Adverse Images
This paper uses the ImageNet and COCO2017 validation datasets for testing. -
YCB-Video dataset
The YCB-Video dataset contains 92 videos of 21 objects with varying textures and sizes under cluttered indoor environments. -
LINEMOD-OCCLUSION dataset
The LMO dataset is a subset of the LM dataset consisting of eight objects in more cluttered scenes. -
LINEMOD dataset
The LM dataset consists of 13 objects with approximately 1.2K images per object. We follow the settings described in [2], which use 15% of the data for training and the rest for testing. -
Fashion-MNIST, MNIST, SVHN, dSprites, and CIFAR-10
The datasets used in the paper are Fashion-MNIST, MNIST, SVHN, dSprites, and CIFAR-10. -
X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Text...
Text-driven 3D stylization is a complex and crucial task in the fields of computer vision (CV) and computer graphics (CG), aimed at transforming a bare mesh to fit a target text. -
Compact Transformer Tracker with Correlative Masked Modeling
Visual object tracking is one of the fundamental tasks in computer vision, with applications including human-computer interaction, surveillance, and traffic flow monitoring. -
Sliding Window ConvNet dataset
The dataset used in this paper is a sliding window ConvNet dataset, which is a collection of 3D images and their corresponding labels. -
3D ConvNet dataset
The dataset used in this paper is a 3D ConvNet dataset, which is a collection of 3D images and their corresponding labels. -
MIT Indoor Scene Recognition
The MIT Indoor Scene Recognition dataset contains 67 categories of indoor scenes. -
FFB6D: A Full Flow Bidirectional Fusion Network for 6D Pose Estimation
A Full Flow Bidirectional Fusion Network for 6D Pose Estimation from a single RGBD image -
Occluded CIFAR
The dataset used in the paper is Occluded CIFAR. -
Cluttered MNIST and CIFAR-10
The datasets used in the paper are Cluttered MNIST and CIFAR-10. -
ImageNet-32
The ImageNet-32 dataset is a subset of the ImageNet dataset, containing 1,281,167 training samples and 50,000 test samples, distributed across 1,000 labels. -
CIFAR10 and ImageNet
CIFAR10 and ImageNet are used in the paper to evaluate the alignment of deep neural networks with human perception.