Video Recognition - Groups

PortraitMode-400

PortraitMode-400 is a dataset dedicated to portrait mode video recognition, with a fine-grained taxonomy of 400 categories.

Dataset
JSON

VideoLT

The VideoLT dataset contains 1,004 classes and 256,218 untrimmed videos collected from YouTube, covering a wide range of human activities, including everyday life,...

Dataset
JSON

LocalStyleFool

LocalStyleFool: Regional Video Style Transfer Attack Using Segment Anything

Dataset
JSON

UCF101 and HMDB51 datasets

The UCF101 and HMDB51 datasets are used for video recognition. The UCF101 dataset contains 101 action categories, while the HMDB51 dataset contains 51 classes.

Dataset
JSON

Kinetics-400, Something-Something V2, Epic-Kitchens-100, HMDB51, and UCF101

The paper uses five video recognition benchmarks: Kinetics-400, Something-Something V2, Epic-Kitchens-100, HMDB51, and UCF101.

Dataset
JSON

Diving48

The Diving48 dataset is a fine-grained video dataset of competitive diving. It has ∼18k trimmed video clips of 48 unambiguous dive sequences standardized by the professional....

Dataset
JSON

Mini-Kinetics

The Mini-Kinetics dataset is a mini version of the Kinetics-400 dataset, containing 240k training samples and 20k validation samples in 400 human action classes.

Dataset
JSON

HowTo100M

This dataset is used in the LORD framework for autonomous driving and consists of images, videos, and text-based observations.

Dataset
JSON

Moments in Time

The Moments in Time dataset is a large-scale video action recognition dataset.

Dataset
JSON

MoViNets: Mobile Video Networks for Efficient Video Recognition

Mobile Video Networks (MoViNets) is a family of computation and memory efficient video networks that can operate on streaming video for online inference.

Dataset
JSON

Moving MNIST

Moving MNIST is a benchmark dataset for video recognition. It has 10,000 samples: 8,000 for training and 2,000 for testing. Each sample consists of 20 sequential gray...

Dataset
JSON

Jester

The Jester dataset contains continuous joke ratings from -10 to 10, along with the jokes’ texts.

Dataset
JSON

Something-Something V1

Video classification is a fundamental problem in many video-based tasks. Applications such as autonomous driving technology, controlling drones and robots are driving the demand...

Dataset
JSON

Temporal-attentive Covariance Pooling Networks for Video Recognition

Video recognition aims to automatically analyze the contents of videos (e.g., events and actions), and has a wide range of applications, including intelligent surveillance,...

Dataset
JSON

Kinetics-600

The Kinetics-600 dataset consists of 392k training videos and 30k validation videos in 600 human action categories.

Dataset
JSON

Multi-Fiber Networks for Video Recognition

The proposed multi-fiber architecture is used for reducing the computational cost of spatio-temporal deep neural networks, making them run as fast as their 2D counterparts while...

Dataset
JSON

Kinetics-400

Motion has been shown to be useful for video understanding, where it is typically represented by optical flow. However, computing flow from video frames is very time-consuming....

Dataset
JSON

Something-Something V1 & V2

The Something-Something V1 & V2 dataset is a large-scale video dataset created by crowdsourcing. It contains about 100k videos over 174 categories, and the number of videos...

Dataset
JSON

UCF101

The UCF101 dataset contains 13,320 videos distributed in 101 action categories. This dataset is different from the above ones in that it contains mostly coarse sports activities...

Dataset
JSON

Kinetics-700

Kinetics-700 is a large-scale video dataset for human action recognition, with 700 action categories.

Dataset
JSON

26 datasets found