- PROPSEGMENT
The PROPSEGMENT dataset is a large-scale corpus for proposition-level segmentation and entailment recognition.
- Sub-sentence encoder
The sub-sentence encoder is a contrastive learning framework that learns contextual embeddings of semantic units at the sub-sentence level.
- MIRFLICKR-25K
The MIRFLICKR-25K dataset consists of 25015 images and 223635 tags, where each image is associated with several textual tags and annotated with a 24-dimensional semantic label.
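
To make the figures in the MIRFLICKR-25K entry concrete, here is a minimal sketch of the arithmetic and of one plausible label layout. It assumes that 223635 counts tag assignments across all images and that the 24-dimensional label is a multi-hot vector over 24 categories. The helper `multi_hot` and the example category indices are hypothetical and are not part of any dataset tooling.

```python
# Figures quoted in the MIRFLICKR-25K entry above.
NUM_IMAGES = 25015
NUM_TAGS = 223635
NUM_LABELS = 24  # dimensionality of the semantic label


def multi_hot(active_indices, num_labels=NUM_LABELS):
    """Build a multi-hot label vector (hypothetical helper).

    Assumption: each of the 24 dimensions marks the presence (1)
    or absence (0) of one semantic category for an image.
    """
    vec = [0] * num_labels
    for i in active_indices:
        if not 0 <= i < num_labels:
            raise ValueError(f"label index {i} out of range")
        vec[i] = 1
    return vec


# Average number of textual tags per image, assuming NUM_TAGS
# is the total number of tag assignments.
avg_tags_per_image = NUM_TAGS / NUM_IMAGES

# An illustrative image annotated with three (made-up) categories.
example_label = multi_hot([0, 5, 23])

print(f"average tags per image: {avg_tags_per_image:.2f}")
print(f"label length: {len(example_label)}, active: {sum(example_label)}")
```

Under these assumptions each image carries roughly nine tags on average, and a single image can belong to several of the 24 categories at once.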