- Deep residual learning for image recognition
ResNet-50 and ResNet-101 are used as the backbone image feature extractors.
- WIDER-Attribute
Human Attribute Recognition (HAR) is a challenging task due to large variations in body gestures, external occlusions, lighting conditions, image resolution and blurriness.
- Degenerate Swin to Win: Plain Window-based Transformer without Sophisticated ...
The proposed Win Transformer consistently outperforms Swin Transformer on multiple computer vision tasks, including image recognition, semantic...
- Microsoft COCO
The Microsoft COCO dataset was used for training and evaluating the CNNs because it has become a standard benchmark for testing algorithms aimed at scene understanding and...
- ImageNet Large Scale Visual Recognition Challenge
A benchmark for low-shot recognition was proposed by Hariharan & Girshick (2017) and consists of a representation learning phase without access to the low-shot classes and a...