-
MV-VTON: Multi-View Virtual Try-On with Diffusion Models
The proposed method for Multi-View Virtual Try-On (MV-VTON) task, which aims at using the frontal and back clothing to reconstruct the dressing results of a person from multiple... -
Dynamic Approach for Lane Detection using Google Street View and CNN
A dataset of 2000 RGB images for lane detection using the SegNet architecture. -
NeRFBuster
A real-world dataset captured by mobile phones and containing quite complex trajectories. -
Oxford RobotCar Dataset
The Oxford RobotCar Dataset is a collection of images and videos of a car driving on various roads and conditions. -
MNIST, CIFAR-10, CIFAR-100, Tiny-ImageNet, VGG-like
The datasets used in the paper are MNIST, CIFAR-10, CIFAR-100, Tiny-ImageNet, and VGG-like. -
Cambridge Landmarks
The Cambridge Landmarks dataset contains 5 different large outdoor scenes of landmarks in the city of Cambridge. -
Sparse ResNet50 model
The dataset used in this paper is a sparse ResNet50 model, which is a variant of the ResNet50 model with 80% sparsity. -
Two-level Group Convolution
The proposed two-level group convolution is suitable for distributed memory computing and is robust to a large number of groups. -
Degenerate Swin to Win: Plain Window-based Transformer without Sophisticated ...
The proposed Win Transformer consistently outperforms Swin Transformer on multiple computer vision tasks, including image recognition, semantic... -
ANTNets: Mobile Convolutional Neural Networks for Resource Efficient Image Cla...
Deep convolutional neural networks have achieved remarkable success in computer vision. However, deep neural networks require large computing resources to achieve high... -
Traffic Signs dataset
The Traffic Signs dataset contains 39252 training images in 43 classes. -
Pose-Aware Video Transformers
Human perception of surroundings is often guided by the various poses present within the environment. Many computer vision tasks, such as human action recognition and robot... -
Cap3D dataset
The Cap3D dataset is a large-scale dataset of 3D models with captions. -
Objaverse-LVIS dataset
The Objaverse-LVIS dataset contains approximately 46,000 3D models in 1,156 categories. -
ImageNet-1000
The dataset used in this paper is ImageNet-1000 pre-trained CNNs. -
Attentive Normalization
The proposed Attentive Normalization (AN) aims to harness the best of feature normalization and feature attention in a single lightweight module.