992 datasets found

Groups: Computer Vision

  • Pascal VOC

    Semantic segmentation is a crucial and challenging task for image understanding. It aims to predict a dense labeling map for the input image, which assigns each pixel a unique...
  • YCB-Video

    The dataset used for 6D object pose estimation, consisting of images of 16 objects with varying levels of occlusion.
  • Objaverse

    The Objaverse dataset contains around 800k 3D objects. After applying a simple CLIP-based filter [27] to remove objects whose rendered images are not relevant to its...
  • Various Datasets

    The datasets used in the paper are described as follows: WikiMIA, BookMIA, Temporal Wiki, Temporal arXiv, ArXiv-1 month, Multi-Webdata, LAION-MI, Gutenberg.
  • Inception Transformer

    Recent studies show that Transformer has strong capability of building long-range dependencies, yet is incompetent in capturing high frequencies that predominantly convey...
  • MOCA: Masked Online Codebook Assignments prediction

    Self-supervised representation learning for Vision Transformers (ViT) to mitigate the greedy needs of ViT networks for very large fully-annotated datasets.
  • ShapeNetPart

    The dataset used in the paper is ShapeNetPart, a synthetic dataset for 3D object part segmentation. It contains 16,881 models from 16 categories.
  • DTU MVS

    Large scale multi-view stereopsis evaluation dataset.
  • PCB Component Detection using Computer Vision for Hardware Assurance

    The dataset used in this study is a semantic PCB image dataset, which contains images of PCBs with annotated components.
  • CIFAR10 and CIFAR100

    The dataset used in the paper is not explicitly described, but it is mentioned that the authors conducted experiments on various vision tasks, including image classification,...
  • N-object dataset testing

    An N-object dataset for testing the proposed framework.
  • Synthetic-NSVF

    The dataset used in the paper "SpikingNeRF: Making Bio-inspired Neural Networks See through the Real World".
  • Synthetic-NeRF

    The dataset used in the paper "SpikingNeRF: Making Bio-inspired Neural Networks See through the Real World".
  • Compute trends across three eras of machine learning

    A dataset of 650 machine learning models presented in academic publications and relevant gray literature.
  • Multiscale Vision Transformers

    Multiscale Vision Transformers (MViT) for video and image recognition, by connecting the seminal idea of multiscale feature hierarchies with transformer models.
  • CLEVR

    CLEVR images contain objects characterized by a set of attributes (shape, color, size and material). The questions are grouped into 5 categories: Exist, Count, CompareInteger,...
  • SUN RGB-D

    RGB-D scene recognition approaches often train two standalone backbones for RGB and depth modalities with the same Places or ImageNet pre-training. However, the pre-trained...
  • DisCo-CLIP: A Distributed Contrastive Loss for Memory Efficient CLIP Training

    We propose DisCo-CLIP, a distributed memory-efficient CLIP training approach, to reduce the memory consumption of contrastive loss when training contrastive learning models.
  • ImageNet Dataset

    Object recognition is arguably the most important problem at the heart of computer vision. Recently, Barbu et al. introduced a dataset called ObjectNet which includes objects in...
  • LSUN-Church

    Progress in GANs has enabled the generation of high-resolution photorealistic images of astonishing quality. StyleGANs allow for compelling attribute modification on such...
You can also access this registry using the API (see API Docs).