- vHeat: Building Vision Models upon Heat Conduction
A fundamental problem in learning robust and expressive visual representations lies in efficiently estimating the spatial relationships of visual semantics throughout the entire...
- MPII Human Pose Dataset
Human pose estimation refers to the task of recognizing postures by localizing body keypoints (head, shoulders, elbows, wrists, knees, ankles, etc.) from images.
- COCO, ADE20K, PASCAL Context, and LVIS datasets
COCO dataset, ADE20K dataset, PASCAL Context dataset, LVIS dataset
- Tetrahedron Splatting for 3D Generation
The dataset used in the paper for 3D generation using TeT-Splatting.
- Independent Sign Language Recognition with 3D Body, Hands, and Face Reconstru...
Independent Sign Language Recognition is a complex visual recognition problem that combines several challenging tasks of Computer Vision due to the necessity to exploit and fuse...
- Skin Cancer MNIST (HAM10000) dataset
The Skin Cancer MNIST (HAM10000) dataset is a good use case to assess the capabilities of attention mechanisms in neural networks.
- Dataset Distillation by Automatic Training Trajectories
Dataset Distillation by Automatic Training Trajectories
- Wide-area image geolocalization with aerial reference imagery
The CVUSA and CVACT datasets are used for cross-view geolocalization. The VIGOR dataset is used for cross-view image retrieval and 3-DoF pose estimation.
- C-BEV: Contrastive Bird’s Eye View Training for Cross-View Image Retrieval an...
The CVUSA and CVACT datasets are used for cross-view geolocalization. The VIGOR dataset is used for cross-view image retrieval and 3-DoF pose estimation.
- Synthetic-Neuroscore: Using a neuro-AI interface for evaluating generative ad...
Generative adversarial networks (GANs) are increasingly attracting attention in the computer vision, natural language processing, speech synthesis and similar domains. However,...
- Neuro-AI Interface for Evaluating Generative Adversarial Networks
Generative adversarial networks (GANs) are increasingly attracting attention in the computer vision, natural language processing, speech synthesis and similar domains. However,...
- Animation Line Art Colorization Dataset
A dataset for animation line art colorization, consisting of 10 different cartoon films with diverse and intense frame variations.
- DEEVA: A Deep Learning and IoT Based Computer Vision System
A deep learning and IoT based computer vision system to process computer vision and natural language in real time in order to address the safety and security of production sites...
- SaLite: A light-weight model for salient object detection
Salient object detection is a prevalent computer vision task that has applications ranging from abnormality detection to abnormality processing. Context modelling is an...
- Defects of Convolutional Decoder Networks in Frequency Representation
The dataset used in the paper to prove the representation defects of a cascaded convolutional decoder network in frequency representation.
- HCI 4D Light Field Benchmark
A dataset and evaluation methodology for depth estimation on 4D light fields.
- Wide-Baseline Light Field Depth Estimation with EPI-Shift
A method for depth estimation from light field data, based on a fully convolutional neural network architecture.
- SqueezeJet: High-level Synthesis Accelerator
Deep convolutional neural networks have dominated the pattern recognition scene by providing much more accurate solutions in computer vision problems such as object recognition...