-
Using Street View and Satellite Images to Estimate House Prices
Street image and satellite image data can capture urban qualities and improve house price estimation. -
MCA: Moment Channel Attention Networks
Channel attention mechanisms recalibrate channel weights to enhance the representational ability of networks. -
Robot Trajectories Dataset
The dataset consists of 80,000 robot trajectories collected via human teleoperation, with 2,800 demonstrations labeled by crowd-sourced language annotators. -
Data-driven Instruction Augmentation for Language-conditioned Control
Data-driven Instruction Augmentation for Language-conditioned Control (DIAL) is a method that uses pre-trained vision-language models (VLMs) to label offline datasets for... -
DVGO dataset
The dataset used in the paper is the DVGO dataset, which contains 3D scenes and is used for training and testing the proposed neural radiance field model. -
LF dataset
The dataset used in the paper is the LF dataset, which contains real-world scenes and is used for training and testing the proposed neural radiance field model. -
KITTI 2012 and 2015 datasets
The KITTI 2012 and 2015 datasets are used for stereo matching experiments. -
COCO object detection and instance segmentation, ADE20K semantic segmentation
The datasets used in the paper are the COCO object detection and instance segmentation dataset and the ADE20K semantic segmentation dataset. -
Towards Accurate BNNs via Modeling Contextual Dependencies
The proposed binary model is built on modeling contextual dependencies. An overview of the BCDNet architecture is illustrated in Fig. 4. We first describe the fundamentals of... -
Adversarial Counterfactual Visual Explanations
Counterfactual explanations and adversarial attacks have a related goal: flipping output labels with minimal perturbations regardless of their characteristics. -
Pascal3D and ShapeNet
Pascal3D and ShapeNet are the datasets used in the paper for object detection, self-driving, and UAV racing tasks. -
CIFAR-100 and ImageNet datasets
The datasets used in the paper are CIFAR-100 and ImageNet. -
Vision-and-Language Navigation
The Vision-and-Language Navigation (VLN) task gives a global natural sentence I = {w_0, ..., w_l} as an instruction, where w_i is a word token and l is the length of the... -
Waymo Open Dataset and nuScenes Dataset
The Waymo Open Dataset and the nuScenes Dataset are used to evaluate the performance of the AFDetV2 model. -
Symmetric parallax attention for stereo image super-resolution
Symmetric parallax attention for stereo image super-resolution. -
Learning parallax attention for stereo image super-resolution
Learning parallax attention for stereo image super-resolution. -
Feedback network for mutually boosted stereo image super-resolution and dispa...
Stereo image super-resolution aims at enhancing the quality of super-resolution results by utilizing the complementary information provided by binocular systems. -
NAFSSR: Stereo Image Super-Resolution Using NAFNet
Stereo image super-resolution aims at enhancing the quality of super-resolution results by utilizing the complementary information provided by binocular systems.