-
ILSVRC2012
The dataset used in the paper is not explicitly described, but it is mentioned that the authors used a subset of the validation dataset used for the ImageNet Large Scale Visual... -
PASCAL VOC 2007
Multi-label image recognition is a practical and challenging task compared to single-label image classification. -
Pascal Visual Object Classes (VOC) Challenge
The Pascal Visual Object Classes (VOC) challenge is a benchmark for object detection and segmentation. -
Reading digits in natural images with unsupervised feature learning
The paper presents a method for reading digits in natural images using unsupervised feature learning. -
SVHN Dataset
The dataset used in the paper is a collection of images from the SVHN dataset, along with labels. The dataset is used for image classification. -
VGG-16 Dataset
The VGG-16 dataset is a large collection of images of objects. -
MNIST Database
The MNIST database of handwritten digits is a popular benchmark data set for classification algorithms. -
Multiscale Vision Transformers
Multiscale Vision Transformers (MViT) for video and image recognition, by connecting the seminal idea of multiscale feature hierarchies with transformer models. -
ImageNet Dataset
Object recognition is arguably the most important problem at the heart of computer vision. Recently, Barbu et al. introduced a dataset called ObjectNet which includes objects in... -
CIFAR-10 Dataset
The dataset used in this paper is a neural network, and the authors used it to test the performance of their lookahead pruning method. -
OpenImages
Large-scale vision-and-language models trained on curated and web-scrapped data have led to significant improvements over task-specific models when transferred to downstream... -
An image is worth 16x16 words: Transformers for image recognition at scale
An image is worth 16x16 words: Transformers for image recognition at scale. -
Microsoft COCO
The Microsoft COCO dataset was used for training and evaluating the CNNs because it has become a standard benchmark for testing algorithms aimed at scene understanding and... -
ImageNet Large Scale Visual Recognition Challenge
A benchmark for low-shot recognition was proposed by Hariharan & Girshick (2017) and consists of a representation learning phase without access to the low-shot classes and a...