Pascal VOC
Semantic segmentation is a crucial and challenging task for image understanding. It aims to predict a dense labeling map for the input image, which assigns each pixel a unique...
Various Datasets
The paper uses the following datasets: WikiMIA, BookMIA, Temporal Wiki, Temporal arXiv, ArXiv-1 month, Multi-Webdata, LAION-MI, and Gutenberg.
Inception Transformer
Recent studies show that Transformer has strong capability of building long-range dependencies, yet is incompetent in capturing high frequencies that predominantly convey...
MOCA: Masked Online Codebook Assignments prediction
A self-supervised representation learning method for Vision Transformers (ViT), designed to mitigate ViT networks' heavy demand for very large, fully annotated datasets.
ShapeNetPart
The dataset used in the paper is ShapeNetPart, a synthetic dataset for 3D object part segmentation. It contains 16,881 models from 16 categories.
PCB Component Detection using Computer Vision for Hardware Assurance
The dataset used in this study is a semantic PCB image dataset, which contains images of PCBs with annotated components.
CIFAR10 and CIFAR100
The dataset is not explicitly described in the paper, but the authors mention conducting experiments on various vision tasks, including image classification,...
N-object dataset testing
An N-object dataset for testing the proposed framework.
Synthetic-NSVF
A dataset used in the paper "SpikingNeRF: Making Bio-inspired Neural Networks See through the Real World".
Synthetic-NeRF
A dataset used in the paper "SpikingNeRF: Making Bio-inspired Neural Networks See through the Real World".
Compute trends across three eras of machine learning
A dataset of 650 machine learning models presented in academic publications and relevant gray literature.
Multiscale Vision Transformers
Multiscale Vision Transformers (MViT) for video and image recognition, by connecting the seminal idea of multiscale feature hierarchies with transformer models.
DisCo-CLIP: A Distributed Contrastive Loss for Memory Efficient CLIP Training
We propose DisCo-CLIP, a distributed memory-efficient CLIP training approach, to reduce the memory consumption of contrastive loss when training contrastive learning models.
ImageNet Dataset
Object recognition is arguably the most important problem at the heart of computer vision. Recently, Barbu et al. introduced a dataset called ObjectNet which includes objects in...
LSUN-Church
Progress in GANs has enabled the generation of high-resolution photorealistic images of astonishing quality. StyleGANs allow for compelling attribute modification on such...