Dataset - LDM

BMLrub

BMLrub is a dataset of human motion and video.
- Dataset
- JSON
UVG dataset

The dataset used in the paper is not explicitly described, but it is mentioned that the authors tested their model on various signal reconstruction tasks: 1D sinusoidal...
- Dataset
- JSON
CelebV-HQ

The CelebV-HQ dataset contains 35,666 face videos from over 15,000 identities.
- Dataset
- JSON
UCF-Crime and XD-Violence datasets

The UCF-Crime and XD-Violence datasets are used for weakly supervised video anomaly detection.
- Dataset
- JSON
ActivityNet v1.2

Weakly-Supervised Temporal Action Localization (WSTAL) aims to localize actions in untrimmed videos with only video-level labels.
- Dataset
- JSON
CamVid Dataset

The CamVid dataset is a benchmark for semantic segmentation. It consists of 700 images with 11 object classes.
- Dataset
- JSON
BDD100K Dataset

The BDD100K dataset is a large-scale dataset for autonomous driving, containing 100,000 images, with 20,000 images for training and 80,000 images for testing.
- Dataset
- JSON
IG65M

IG65M is the dataset used in the paper for self-supervised learning of video representations.
- Dataset
- JSON
HMDB51, UCF101, Kinetics-400, and IG65M

Four datasets are relevant for our study: HMDB51, UCF101, Kinetics-400, and IG65M.
- Dataset
- JSON
DAVIS and YouTube-VOS datasets

The DAVIS dataset comprises 60, 30 and 30 video sequences for training, validation and testing, respectively. The YouTube-VOS dataset is a larger dataset with 3471 videos for...
- Dataset
- JSON
Every Moment Counts

The Every Moment Counts dataset contains 1,000 hours of video footage of various activities.
- Dataset
- JSON
Hollywood in Homes

The Hollywood in Homes dataset contains 9,848 videos of daily activities across 157 classes.
- Dataset
- JSON
THUMOS Challenge

The THUMOS Challenge dataset contains 413 sports videos of 65 classes.
- Dataset
- JSON
Dual DETRs for Multi-Label Temporal Action Detection

Temporal Action Detection (TAD) aims to identify the action boundaries and the corresponding category within untrimmed videos.
- Dataset
- JSON
Multi Visual Modality Fall Detection Dataset (MUVIM)

The Multi Visual Modality Fall Detection Dataset (MUVIM) was used for anomaly detection of falls. It contains data from six vision-based sensors of different modalities including thermal,...
- Dataset
- JSON
UCF-101 dataset

The UCF-101 dataset is a large-scale action recognition dataset, containing 13,320 videos categorized into 101 human action categories.
- Dataset
- JSON
Jester

The Jester dataset contains continuous joke ratings from -10 to 10, along with the jokes’ texts.
- Dataset
- JSON
DAVIS

The DAVIS dataset is a widely used dataset for video-related tasks, consisting of approximately 2000 frames from 26 human-centric scenarios.
- Dataset
- JSON
MUSES

The MUSES dataset is a collection of 3,697 videos, with 2,587 for training and 1,110 for testing.
- Dataset
- JSON
MultiTHUMOS

Temporal action localization (TAL) is a prevailing task due to its great application potential. Existing works in this field mainly suffer from two weaknesses: (1) They often...
- Dataset
- JSON

You can also access this registry using the API (see API Docs).

86 datasets found