1,082 datasets found

Groups: Computer Vision

  • Microsoft COCO

    The Microsoft COCO dataset was used for training and evaluating the CNNs because it has become a standard benchmark for testing algorithms aimed at scene understanding and...
  • ImageNet Large Scale Visual Recognition Challenge

    A benchmark for low-shot recognition was proposed by Hariharan & Girshick (2017) and consists of a representation learning phase without access to the low-shot classes and a...
  • KITTI 2015

    The KITTI 2015 dataset is a real-world dataset of street views, containing 200 training stereo image pairs with sparsely labeled disparity from LiDAR data.
  • Scene Flow

    Stereo matching aims to recover the dense reconstruction of unknown scenes by computing the disparity from rectified stereo images, helping robots intelligently interact with...
  • FPDeep: Scalable Acceleration of CNN Training on Deeply-Pipelined FPGA Clusters

    This entry refers to CNN training workloads rather than a named dataset: the paper trains the VGG-16, VGG-19, and AlexNet networks.
  • Pokémon

  • FFHQ

    Large-scale datasets [18, 17, 27, 6] boosted text-conditional image generation quality. However, in some domains it could be difficult to make such datasets and usually it could...
  • FusionT-LESS

    Sensor fusion can significantly improve the performance of many computer vision tasks. However, traditional fusion approaches are either not data-driven and cannot exploit prior...
  • FusionCelebA

    Sensor fusion can significantly improve the performance of many computer vision tasks. However, traditional fusion approaches are either not data-driven and cannot exploit prior...
  • FusionMNIST

    Sensor fusion can significantly improve the performance of many computer vision tasks. However, traditional fusion approaches are either not data-driven and cannot exploit prior...
  • CIFAR-100, MNIST, ImageNet, MIT67, SUN397, Places205

    The paper uses CIFAR-100, MNIST, and ImageNet for object recognition, and MIT67, SUN397, and Places205 for scene recognition.
  • Learning Multiple Layers of Features from Tiny Images

    The CIFAR-10 dataset consists of 50,000 training images and 10,000 test images (60,000 in total). Each image is a 32×32 color image.
  • Neural 3D Video Synthesis from Multi-View Video

    The DyNeRF dataset contains 3D dynamic scenes with moving or deforming objects.
  • Streaming Radiance Fields for 3D Video Synthesis

    The MeetRoom dataset contains 3D dynamic scenes with moving or deforming objects.
  • D-NeRF: Neural Radiance Fields for Dynamic Scenes

    The D-NeRF dataset contains 3D dynamic scenes with moving or deforming objects.
  • DressCode

    The DressCode dataset targets multi-category virtual try-on, which aims to transfer an in-shop garment from different categories onto a specific person.
  • VITON-HD

    Virtual try-on focuses on adjusting the given clothes to fit a specific person seamlessly while avoiding any distortion of the patterns and textures of the garment. The clothing...
  • CIFAR-10 and ImageNet

    The paper does not explicitly describe its dataset; it mentions only that the authors used the CLIP model together with the CIFAR-10 and ImageNet datasets.
  • COCO

    Large-scale datasets [18, 17, 27, 6] boosted text-conditional image generation quality. However, in some domains it could be difficult to make such datasets and usually it could...
  • ModelNet40

    Point cloud registration is a crucial problem in computer vision and robotics. Existing methods either rely on matching local geometric features, which are sensitive to the pose...