Dataset - LDM

OBoW: Online Bag-of-Visual-Words Generation for Self-Supervised Learning

The paper does not explicitly describe a dataset, but it mentions that the authors used ImageNet, Places205, and VOC07 for evaluation.
- Dataset
- JSON
Trypophobia dataset

A dataset for training and testing convolutional neural networks that detect trypophobia triggers.
- Dataset
- JSON
V-COCO

The V-COCO dataset contains 2,533 training images, 2,867 validation images, and 4,946 test images, including 24 action classes.
- Dataset
- JSON
COCO Keypoint Benchmark

The COCO keypoint benchmark is a widely used dataset for human pose estimation.
- Dataset
- JSON
Context-and-Spatial Aware Network for Multi-Person Pose Estimation

Multi-person pose estimation is a fundamental yet challenging task in computer vision. Both rich context information and spatial information are required to precisely locate the...
- Dataset
- JSON
Faces Dataset

The dataset used in the paper to test the GaMeS model. It contains images of six people, generated using Blender software from various perspectives, excluding the backs of...
- Dataset
- JSON
Mip-NeRF360 dataset

The dataset used in the paper to test the GaMeS model. It contains 5 outdoor and 4 indoor scenes, each featuring intricate central objects or areas against detailed backgrounds.
- Dataset
- JSON
LSUN Bedroom and LSUN Cat datasets

The LSUN Bedroom and LSUN Cat datasets are large-scale image datasets used for training and testing the proposed approach.
- Dataset
- JSON
Exploring Advances in Transformers and CNN for Skin Lesion Diagnosis on Small...

Skin cancer is one of the most common types of cancer in the world. Different computer-aided diagnosis systems have been proposed to tackle skin lesion diagnosis, most of them...
- Dataset
- JSON
Vision Big Bird

Vision Big Bird: Random Sparsification for Full Attention
- Dataset
- JSON
ScanNet-v2

Learning from bounding-boxes annotations has shown great potential in weakly-supervised 3D point cloud instance segmentation. However, we observed that existing methods would...
- Dataset
- JSON
YouTube Faces (YTF)

The YouTube Faces (YTF) dataset contains 3,424 videos belonging to 1,595 different identities.
- Dataset
- JSON
Labeled Faces in the Wild (LFW)

The Labeled Faces in the Wild (LFW) dataset contains 13,233 facial images belonging to 5,749 different individuals.
- Dataset
- JSON
CIFAR-10, Tiny ImageNet, and ImageNet

The datasets used in the paper are CIFAR-10, Tiny ImageNet, and ImageNet.
- Dataset
- JSON
REalistic Single Image DEhazing (RESIDE) dataset

The RESIDE dataset is a large-scale dataset for benchmarking single image dehazing algorithms, and it includes both indoor and outdoor hazy images.
- Dataset
- JSON
Pyramid VisionLLaMA: A versatile backbone for dense prediction without convol...

Pyramid VisionLLaMA: A versatile backbone for dense prediction without convolutions.
- Dataset
- JSON
Conditional positional encodings for vision transformers

Conditional positional encodings for vision transformers.
- Dataset
- JSON
Twins: Revisiting the design of spatial attention in vision transformers

Twins: Revisiting the design of spatial attention in vision transformers.
- Dataset
- JSON
MobileVLM: A fast, reproducible and strong vision language assistant for mobi...

MobileVLM: A fast, reproducible and strong vision language assistant for mobile devices.
- Dataset
- JSON
VisionLLaMA: A Unified LLaMA Interface for Vision Tasks

VisionLLaMA is a unified and generic modeling framework for solving most vision tasks.
- Dataset
- JSON

You can also access this registry using the API (see API Docs).

992 datasets found