The VQA 2.0 dataset is used for the visual question answering task. It consists of three sets: a training set containing 83k images and 444k questions, a validation set containing...
The CREMA-D dataset is an audio-visual dataset for the emotion recognition task. Each video in it contains both facial and acoustic emotional expressions.
The dataset used in the paper is a healthcare dataset containing patient information, including vital signs, lab values, and medication administration. The dataset is used to...