- MS COCO dataset
The MS COCO dataset is a large-scale benchmark for image captioning, containing 328K images with 5 captions each.
- ImageNet: A Large-Scale Hierarchical Image Database
The ImageNet dataset is a large-scale image database that contains over 14 million images, each labeled with one of 21,841 categories.
- Hyperhuman
The Hyperhuman dataset is used for 3D face rendering.
- ImageNet21K
The ImageNet21K dataset is used for training and evaluation of the proposed Circulant Channel-Specific (CCS) token-mixing MLP.
- MS COCO 2017
The dataset used in this paper is a collection of frames for video coding, with different Quantisation Parameters (QPs) and frame types.
- Omniglot dataset
The Omniglot dataset consists of 100 classes, each containing 20 images. Ten images were taken from each class for augmentation, and the rest were used as the test set. Each...
- Blender Dataset
The Blender dataset consists of 8 synthetic 3D scenes, each with a hundred posed images of resolution 800 × 800.
- TinyImagenet dataset
The paper does not describe the dataset explicitly, but the authors state that they used the TinyImagenet dataset to pre-train the embedding functions.
- CIFAR-10, CIFAR-100, and STL-10 datasets
The paper does not describe the dataset explicitly, but the authors state that they used the CIFAR-10, CIFAR-100, and STL-10 datasets for training and testing the...
- Surface Networks
The dataset used in the paper is a 3D mesh dataset, which is used for training and testing the Surface Networks model.
- Neural 3D Mesh Renderer
The dataset used in the paper Neural 3D Mesh Renderer consists of 3D models of objects.
- Human Action Recognition
The Human Action Recognition dataset is used for human action recognition tasks.
- MV-VTON: Multi-View Virtual Try-On with Diffusion Models
The proposed method addresses the Multi-View Virtual Try-On (MV-VTON) task, which aims to use the frontal and back clothing to reconstruct the dressing results of a person from multiple...