-
ShapeNet Annotated with Referring Expressions (SNARE)
A benchmark dataset for grounding natural language referring expressions to distinguish 3D objects. -
ProofWriter
ProofWriter: Generating implications, proofs, and abductive statements over natural language -
Situated Dataset
The situated dataset is a dataset of objects annotated with properties and affordances in real-world images. -
Abstract Dataset
The abstract dataset is a refreshed version of the McRae dataset, pruned and densely annotated to eliminate false negatives present in previous work. -
Fashion IQ
Fashion IQ is a new dataset for research on natural-language-based image retrieval systems, situated in the detail-critical fashion domain. -
ToG: Text of Gaze Dataset
Text-to-gaze dataset containing over 90k text descriptions of human gaze behavior. -
Localizing moments in video with natural language
-
InterHuman
Humanoid Reaction Synthesis is pivotal for creating highly interactive and empathetic robots that can seamlessly integrate into human environments, enhancing the way we live,... -
Clotho: An audio captioning dataset
Audio captioning is a multi-modal task, focusing on using natural language for describing the contents of general audio. Most audio captioning methods are based on deep neural... -
Natural Image-Text Dataset
The dataset used for training the Vary-base model, containing natural image-text pairs.