-
SemEval 2014 Task 4 dataset
The SemEval 2014 Task 4 dataset contains labeled sentences and sentence-aspect pairs for aspect-term sentiment analysis, focusing on specific domains such as restaurants and... -
Hands 2017 challenge dataset
The Hands 2017 challenge dataset contains depth images used for training and testing 3D hand pose estimation methods, with a focus on various hand shapes and poses. -
WMT 2014 English-to-French Dataset
The WMT 2014 English-to-French dataset contains 36 million sentence pairs that are used to benchmark translation models. -
WMT 2014 English-to-German Dataset
The WMT 2014 English-to-German dataset consists of 4.5 million sentence pairs used for neural machine translation. -
General Language Understanding Evaluation (GLUE) benchmark
GLUE is a multi-task benchmark that contains a diverse set of natural language understanding tasks including sentiment analysis, natural language inference, and textual... -
IWSLT'14 German to English Translation Dataset
The IWSLT'14 (International Workshop on Spoken Language Translation) German to English dataset consists of parallel sentences for machine translation tasks, containing approximately... -
University of Maryland Reddit Suicidality Dataset
The University of Maryland Reddit Suicidality Dataset contains Reddit posts from the r/SuicideWatch subreddit, used to assess suicidality risk based on user postings. -
CSMSC Dataset
The CSMSC dataset is a corpus for Mandarin Chinese speech synthesis research. -
JVS Corpus
The JVS corpus is a free Japanese multi-speaker voice corpus used for various speech synthesis tasks. -
Jacquard Dataset
The Jacquard dataset is a large-scale dataset for robotic grasp detection, featuring dense grasp rectangle annotations. -
Cornell Grasping Dataset
The Cornell Grasping Dataset (CGD) contains manually labeled grasp annotations for a limited number of examples and is used for robotic grasp detection. -
WMT English-German Translation
The WMT English-German translation task is used for supervised conditional language generation; the authors assess the model's performance in translating from English to German. -
MTG-Jamendo Dataset
The MTG-Jamendo dataset is used for automatically recognizing the emotions and themes in music recordings based on the raw audio, focusing on mood and theme tagging. -
Cornell Movie Dialogues
The Cornell Movie Dialogues dataset features two-character dialogues from movie scripts, capturing a large variety of human interaction in many different fictional circumstances. -
MalwareTextDB
The MalwareTextDB corpus consists of APT (advanced persistent threat) reports describing malware-related information, used for text classification and token label prediction tasks. -
CelebA-HQ 256x256
The 256x256 CelebA-HQ dataset is utilized to train an Image Transformer for autoregressive image generation.