-
ImageNet-50/100/200
The paper does not name its dataset explicitly, but it appears to use the ImageNet-50/100/200 subsets for classification. -
Occluded CIFAR
The dataset used in the paper is Occluded CIFAR. -
Cluttered MNIST and CIFAR-10
The datasets used in the paper are Cluttered MNIST and CIFAR-10. -
Counting Objects by Diffused Index
Counting objects is a fundamental but challenging problem. In this paper, we propose diffusion-based, geometry-free, and learning-free methodologies to count the number of... -
ImageNet-32
The ImageNet-32 dataset is a downsampled variant of ImageNet (images resized to 32×32 pixels), containing 1,281,167 training samples and 50,000 test samples across 1,000 classes. -
DTU MVS Dataset
The DTU MVS dataset contains 49 images of physical objects in real environments. -
CIFAR10 and ImageNet
CIFAR10 and ImageNet are used in the paper to evaluate the alignment of deep neural networks with human perception. -
Fishyscapes
Fishyscapes: A benchmark for safe semantic segmentation in autonomous driving with annotations for pedestrian and vehicle detection. -
MUAD: Multiple Uncertainties for Autonomous Driving
MUAD: A synthetic dataset for autonomous driving with multiple uncertainties and annotations for semantic segmentation, depth estimation, object detection, and instance... -
Container: A General-Purpose Building Block for Multi-Head Context Aggregation
Convolutional neural networks (CNNs) are ubiquitous in computer vision, with a myriad of effective and efficient variations. Recently, Transformers – originally introduced in... -
Mutant and LEGO Dataset
The Mutant and LEGO dataset is a dynamic scene dataset. 90% of its images are used for training and 10% for evaluation. -
Tanks and Temples Advanced (T&T) Dataset
The Tanks and Temples Advanced (T&T) dataset is a benchmark dataset for image-based 3D reconstruction. 90% of its images are used for training and 10% for evaluation. -
Visual Wake Words (VWW) dataset
The Visual Wake Words (VWW) dataset consists of high-resolution images that include visual cues to 'wake-up' AI-powered home assistant devices. -
HSViT: Horizontally Scalable Vision Transformer
This paper introduces a horizontally scalable vision transformer (HSViT) scheme with a novel image-level feature embedding. The design of HSViT preserves the inductive bias from... -
ADE20k for semantic segmentation
The dataset used in this paper is ADE20k for semantic segmentation. -
SPL2018 and DsTok datasets for computer-generated image detection
The SPL2018 and DsTok datasets are used for computer-generated image detection. -
Dual Stream Computer-Generated Image Detection Network Based on Channel Joint...
The paper proposes a dual-stream convolutional neural network for computer-generated image detection. -
CNN Models
Rather than a conventional dataset, this paper evaluates a wide variety of popular CNN models, including straightforward, complicated-connected, and grouped architectures. -
ModelNet40, ModelNet10
The datasets used in the paper are ModelNet40 and ModelNet10, two benchmarks of 3D CAD models; ModelNet10 is a 10-class subset of ModelNet. -
ShapeNet, ModelNet40, ModelNet10
The datasets used in the paper are ShapeNet, a large-scale dataset of 3D models, and the ModelNet40 and ModelNet10 benchmarks of 3D CAD models.