Dataset - LDM

Two-level Group Convolution

The proposed two-level group convolution is suitable for distributed memory computing and robust to a large number of groups.
- Dataset
- JSON
Degenerate Swin to Win: Plain Window-based Transformer without Sophisticated ...

The proposed Win Transformer consistently outperforms Swin Transformer on multiple computer vision tasks, including image recognition, semantic...
- Dataset
- JSON
ResNet-50

The dataset used in the paper is the ResNet-50 dataset, a convolutional neural network model.
- Dataset
- JSON
ANTNets: Mobile Convolutional Neural Networks for Resource Efficient Image Cla...

Deep convolutional neural networks have achieved remarkable success in computer vision. However, deep neural networks require large computing resources to achieve high...
- Dataset
- JSON
Traffic Signs dataset

The Traffic Signs dataset contains 39,252 training images in 43 classes.
- Dataset
- JSON
Pose-Aware Video Transformers

Human perception of surroundings is often guided by the various poses present within the environment. Many computer vision tasks, such as human action recognition and robot...
- Dataset
- JSON
Cap3D dataset

The Cap3D dataset is a large-scale dataset of 3D models with captions.
- Dataset
- JSON
Objaverse-LVIS dataset

The Objaverse-LVIS dataset contains approximately 46,000 3D models in 1,156 categories.
- Dataset
- JSON
ImageNet-1000

The paper uses CNNs pre-trained on ImageNet-1000.
- Dataset
- JSON
Attentive Normalization

The proposed Attentive Normalization (AN) aims to harness the best of feature normalization and feature attention in a single lightweight module.
- Dataset
- JSON
Graph Edit Distance

Graph Edit Distance as a quadratic assignment problem.
- Dataset
- JSON
Binarized MNIST

We use the preprocessed binarized MNIST dataset from [49] which has a split of 50k/10k/10k.
- Dataset
- JSON
MNIST and CIFAR-10 datasets

The MNIST and CIFAR-10 datasets are used to test the theory suggesting the existence of many saddle points in high-dimensional functions.
- Dataset
- JSON
ImageNet, ImageNet ReaL, ImageNet V2, etc.

The dataset used in the paper is not explicitly described. However, it is mentioned that the authors used various benchmarks such as ImageNet, ImageNet ReaL, ImageNet V2, etc.
- Dataset
- JSON
VideoAttentionTarget

VideoAttentionTarget is a video-based gaze target dataset comprising 71,666 frames from 1,331 clips.
- Dataset
- JSON
GazeFollow

GazeFollow is a large-scale dataset consisting of 122,143 images with 130,339 annotations on head-target instances.
- Dataset
- JSON
GazeHTA: End-to-end Gaze Target Detection with Head-Target Association

Gaze target detection aims to directly associate individuals and their gaze targets within a single image or across multiple video frames.
- Dataset
- JSON
AlexNet

The dataset used in the paper is the AlexNet dataset, which contains 60,000 32x32 color images in 10 classes, with 6,000 images per class.
- Dataset
- JSON
DINOv2: Learning robust visual features without supervision

The authors propose a method for self-supervised representation learning using knowledge distillation and vision transformers.
- Dataset
- JSON
Diffusion Classifier

The authors propose a method for zero-shot classification that leverages conditional density estimates from text-to-image diffusion models.
- Dataset
- JSON

You can also access this registry using the API (see API Docs).

992 datasets found