-
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transfo...
Semantic segmentation is a fundamental task in computer vision and enables many downstream applications. It is related to image classification since it produces per-pixel... -
PipeTransformer: Automated Elastic Pipelining for Distributed Training of Tra...
The datasets used in this paper are ImageNet, SQuAD, and GLUE. -
Surf-NeRF: A modified implementation of S-NeRF for surface reconstruction fro...
A modified implementation of the Shadow Neural Radiance Field (S-NeRF) model for surface reconstruction from satellite images. -
MPViT: Multi-Path Vision Transformer for Dense Prediction
Dense computer vision tasks such as object detection and segmentation require effective multi-scale feature representation for detecting or classifying objects or regions with... -
Multi-View HDR Datasets
High dynamic range (HDR) novel view synthesis (NVS) aims to create photorealistic images from novel viewpoints using HDR imaging techniques. -
TidySim: A 3D object rearrangement simulator
The dataset is a collection of 75 user-generated scenes for a tidying task, where users are asked to arrange objects in a tidy manner. -
ETH3D Stereo Dataset
A benchmark for stereo matching, consisting of 50 stereo pairs with ground truth disparity maps. -
Middlebury Stereo Dataset v3
A benchmark for stereo matching, consisting of 11 stereo pairs with ground truth disparity maps. -
KITTI 2012 Stereo Vision Benchmark
A benchmark for stereo matching, consisting of 75 stereo pairs with ground truth disparity maps. -
SCAPE dataset
3D shape analysis is an important research topic in computer vision and graphics. The dataset used in this paper is a collection of 3D shapes with the same connectivity to train... -
rotated MNIST, CIFAR-10, and PatchCamelyon
The paper does not describe its data in detail, but it states that the authors used the rotated MNIST, CIFAR-10, and PatchCamelyon datasets. -
Learning to predict 3D surfaces of sculptures from single and multiple views
A dataset for predicting the 3D shapes of sculptures from a single image or multiple images. -
Through the Looking Glass
A dataset for predicting the 3D shapes of transparent objects from a single image. -
TransProteus
A synthetic dataset for predicting 3D shapes, masks, and properties of materials, liquids, and objects inside transparent containers. -
The KITTI Vision Benchmark Suite
A benchmark suite for 3D vision tasks. -
Street View Synthesis with Gaussian Splatting and Diffusion Prior
Novel View Synthesis (NVS) for street scenes in autonomous driving scenarios. -
Content-Aware Convolutional Neural Networks
Convolutional Neural Networks (CNNs) have achieved great success due to the powerful feature learning ability of convolution layers. Specifically, the standard convolution... -
MobileNetV1 and MobileNetV2
This entry refers to MobileNetV1 and MobileNetV2, which are models rather than datasets: both are commonly used vision backbones for custom ML hardware. -
MIT-Adobe FiveK dataset
The dataset used to train and test the proposed NICER approach to aesthetic image enhancement. -
CLE Diffusion: Controllable Light Enhancement Diffusion Model
Low-light image enhancement has gained increasing importance with the rapid development of visual creation and editing. However, most existing enhancement algorithms are...