Video Classification - Groups

Long-term Leap Attention, Short-term Periodic Shift for Video Classification

A video transformer naturally incurs a heavier computational burden than a static vision transformer, as the former processes a sequence T times longer than the latter under the...

YouTube-8M: A Large-Scale Video Classification Benchmark

YouTube-8M is a large-scale video classification benchmark.

VideoLT

The VideoLT dataset contains 1,004 classes and 256,218 untrimmed videos collected from YouTube, covering a wide range of human activities, including everyday life,...

15 Scenes

The dataset used in this paper is a benchmark dataset for image and video classification. It contains 15 scenes with 4485 images, and 102 classes with 9144 images. The dataset...

Condensed Movies

A dataset used for text-to-video retrieval and video classification tasks.

Kinetics dataset

The Kinetics dataset is a large-scale action recognition dataset. It contains videos of various actions performed by humans, with annotations of the actions performed.

Kinetics-600

The Kinetics-600 dataset consists of 392k training videos and 30k validation videos in 600 human action categories.

ActivityNet Captions

ActivityNet Captions is a benchmark dataset proposed for dense video captioning. There are 20K untrimmed videos in total, and each video has several annotated segments with...

MSR-VTT

The dataset used in the paper is MSR-VTT, a large video description dataset for bridging video and language. The dataset contains 10k video clips with length varying from 10 to...

Youtube-8M

YouTube-8M is a large-scale video classification benchmark.

Charades

A dataset for video action classification, consisting of 9.8k training videos, 1.8k validation videos, and 157 classes.


11 datasets found