Video Understanding - Groups

TVQA

TVQA is a video question answering dataset collected from 6 long-running TV shows from 3 genres. There are 21,793 video clips in total for QA collection, accompanied with...

Dataset
JSON

TGIF-QA

The TGIF-QA dataset consists of 165165 QA pairs chosen from 71741 animated GIFs. To evaluate the spatiotemporal reasoning ability at the video level, TGIF-QA dataset designs...

Dataset
JSON

Kinetics-400

Motion has shown to be useful for video understanding, where motion is typically represented by optical flow. However, computing flow from video frames is very time-consuming....

Dataset
JSON

MSR-VTT

The dataset used in the paper is MSR-VTT, a large video description dataset for bridging video and language. The dataset contains 10k video clips with length varying from 10 to...

Dataset
JSON

UCF101

The UCF101 dataset contains 13320 videos distributed in 101 action categories. This dataset is different from the above ones in that it contains mostly coarse sports activities...

Dataset
JSON

5 datasets found

TVQA

TGIF-QA

Kinetics-400

MSR-VTT

UCF101