-
Microsoft COCO
The Microsoft COCO dataset was used for training and evaluating the CNNs because it has become a standard benchmark for testing algorithms aimed at scene understanding and... -
ImageNet Large Scale Visual Recognition Challenge
A benchmark for low-shot recognition was proposed by Hariharan & Girshick (2017) and consists of a representation learning phase without access to the low-shot classes and a... -
KITTI 2015
The KITTI 2015 dataset is a real-world dataset of street views, containing 200 training stereo image pairs with sparsely labeled disparity from LiDAR data. -
Scene Flow
Stereo matching aims to recover the dense reconstruction of unknown scenes by computing the disparity from rectified stereo images, helping robots intelligently interact with... -
FPDeep: Scalable Acceleration of CNN Training on Deeply-Pipelined FPGA Clusters
The paper's training workloads are the CNNs VGG-16, VGG-19, and AlexNet, which are network architectures rather than datasets. -
FusionT-LESS
Sensor fusion can significantly improve the performance of many computer vision tasks. However, traditional fusion approaches are either not data-driven and cannot exploit prior... -
FusionCelebA
Sensor fusion can significantly improve the performance of many computer vision tasks. However, traditional fusion approaches are either not data-driven and cannot exploit prior... -
FusionMNIST
Sensor fusion can significantly improve the performance of many computer vision tasks. However, traditional fusion approaches are either not data-driven and cannot exploit prior... -
CIFAR-100, MNIST, ImageNet, MIT67, SUN397, Places205
The paper uses CIFAR-100, MNIST, and ImageNet for object recognition, and MIT67, SUN397, and Places205 for scene recognition. -
Learning Multiple Layers of Features from Tiny Images
The CIFAR-10 dataset consists of 60,000 images: 50,000 training images and 10,000 test images. Each is a 32×32 color image. -
Neural 3D Video Synthesis from Multi-View Video
The DyNeRF dataset contains 3D dynamic scenes with moving or deforming objects. -
Streaming Radiance Fields for 3D Video Synthesis
The MeetRoom dataset contains 3D dynamic scenes with moving or deforming objects. -
D-NeRF: Neural Radiance Fields for Dynamic Scenes
The D-NeRF dataset contains 3D dynamic scenes with moving or deforming objects. -
CIFAR-10 and ImageNet
The paper does not describe its data explicitly; it mentions using the CLIP model and the CIFAR-10 and ImageNet datasets. -
ModelNet40
Point cloud registration is a crucial problem in computer vision and robotics. Existing methods either rely on matching local geometric features, which are sensitive to the pose...